PointFire Search Summarizer Adds Query Rewriting to Improve SharePoint Search

A New Milestone: Smarter Queries and Results
After months of R&D, we are releasing the second part of PointFire Search Summarizer: query rewriting, sometimes known as query reformulation. The app now supports both improvements to the query before it is launched and improvements to the results that are returned. And it does this with very little latency. In fact, this new functionality doesn’t require any additional calls to the Large Language Model.
It also uses the same philosophy as before: don’t replace the search process with AI, enhance it using AI, and do so in a way that preserves transparency. It tells you exactly what query is being used, and relies on your judgment, not the machine’s.
What Problems Does Query Rewriting Solve?
Let’s look at what problems it addresses and how it improves the query.
1. Synonyms: for example, it expands biodiversity to ‘biodiversity OR ("biological diversity" OR "ecosystem diversity" OR "species diversity")’
2. Alternate words: for example, “sweater and pants and sneakers” is changed to '(sweater OR jumper OR pullover) AND (pants OR trousers OR slacks) AND (sneakers OR trainers OR "athletic shoes")'
3. Acronyms: expands “NYC LIRR” to ‘("NYC" OR "New York City" OR "NY") AND (LIRR OR "Long Island Rail Road" OR "Long Island RR")’
4. Typos: corrects misspellings, either with a straight replacement or, if uncertain, with an “OR” of the typo and the correction
5. Stemming: handling grammatical variants like plurals and past tense, which is generally not required
6. Knowledge of KQL operators: expands “habitat loss” to “habitat NEAR (loss OR decline OR destruction OR degradation)”
7. Natural language queries: stopwords
8. Natural language queries: properties
9. Knowledge of metadata expressions
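To make the first few items concrete, here is a minimal sketch in Python of how a list of alternates can be turned into a KQL expression. It is only an illustration, not how PointFire Search Summarizer is implemented: the SYNONYMS table and the function names are hypothetical, and a real rewriter would choose alternates based on context rather than a fixed lookup.

```python
# Hypothetical lookup table, for illustration only.
SYNONYMS = {
    "sweater": ["jumper", "pullover"],
    "pants": ["trousers", "slacks"],
    "sneakers": ["trainers", "athletic shoes"],
}

def quote(term: str) -> str:
    """Wrap multi-word phrases in double quotes so KQL treats them as phrases."""
    return f'"{term}"' if " " in term else term

def expand(words, synonyms=SYNONYMS) -> str:
    """Join each word with its alternates using OR, then join the groups with AND."""
    groups = []
    for word in words:
        alternates = [word] + synonyms.get(word, [])
        if len(alternates) == 1:
            groups.append(quote(word))
        else:
            groups.append("(" + " OR ".join(quote(a) for a in alternates) + ")")
    return " AND ".join(groups)

print(expand(["sweater", "pants", "sneakers"]))
```

With the table above, this prints the same expression as the “sweater and pants and sneakers” example in the list.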
Understanding Each Improvement in Detail
Synonyms and Context Awareness
There is a lot more to synonyms than just a list of interchangeable terms for individual words. For example, the word support might be replaced with '(support OR "load bearing" OR "reinforc*")' when used in one context, and '(support OR backing OR fund* OR "capacity building")' in another context. This is the polysemy/homography problem, to use the technical terms. To figure out the synonyms, you have to narrow down the sense of the word, and other words in the query help to do that. But in addition, you have to look at groups of words that together have a meaning. For example, to find synonyms for “poor state of repairs” you can’t just provide synonyms of individual words. You have to know when a sequence of words has synonyms made up of one or more words, like “decrepit”. You also have to know when word order does and doesn’t matter. For example, “egg prices” can be replaced with “egg* NEAR price*” to cover sentences that contain “the price of eggs” or “the price per dozen eggs”.
Alternate Words Across Regions
Alternate words are similar to synonyms. English terms differ from country to country, and the average person may not be aware of the differences. You may know that words like colour are spelled color in the US, and an American may be aware that what they call an apartment others call a flat, and an elevator is a lift, but would they know about rutabagas and swedes, arugula and rocket, zucchini and courgette, and many others?
Acronyms and Context Sensitivity
Acronyms are like a different type of synonym. Again, it is useful to have context, and a straight substitution is not enough. For instance, “IP” can stand for “intellectual property” or “internet protocol” depending on the other words in the query. “MS” can be "manuscript," "Master of Science," or "multiple sclerosis".
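As a rough sketch of the idea, the Python below picks an expansion for “IP” by checking which sense has cue words that appear elsewhere in the query. The SENSES table, its cue words, and the function name are all hypothetical, and simple word overlap is a stand-in here for the context awareness that a language model provides.

```python
# Hypothetical table: each expansion is paired with words that hint at that sense.
SENSES = {
    "IP": {
        "intellectual property": {"patent", "copyright", "trademark", "licensing"},
        "internet protocol": {"address", "network", "router", "subnet"},
    },
}

def expand_acronym(acronym, query_words, senses=SENSES) -> str:
    """Expand an acronym, keeping only the senses supported by the rest of the query."""
    candidates = senses.get(acronym, {})
    context = {w.lower() for w in query_words}
    matched = [exp for exp, cues in candidates.items() if cues & context]
    # If the context does not settle it, keep every known expansion.
    chosen = matched or list(candidates)
    if not chosen:
        return acronym
    return "(" + " OR ".join([acronym] + [f'"{e}"' for e in chosen]) + ")"

print(expand_acronym("IP", ["IP", "address", "conflict"]))
print(expand_acronym("IP", ["IP", "policy"]))
```

The first call keeps only "internet protocol" because “address” is in the query. The second has no helpful context, so it keeps both expansions rather than guessing.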
Handling Typos and Uncertain Spelling
Typos are simple enough to correct, but spell checkers aren’t smart, and they don’t know the probability of the typo being a real term. When creating a search query, some typos can safely be replaced. But for others, is there a chance that it’s not a typo? If someone types “Kernel sanders”, they are probably talking about fried chicken and don’t know how to spell “colonel”. But isn’t it possible that “Kernel” is a brand of abrasives (yes it is)? Alternatively, isn’t it possible that there is a machine that scarifies seeds and kernels to help them germinate (yes there is, but it’s not called that)? In case of doubt, it should use “OR” to allow both the original and the corrected versions, just in case.
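The decision can be sketched in a few lines of Python. This is an illustration under assumptions: the confidence score, the 0.9 threshold, and the function name are hypothetical, and the source does not say how the app estimates whether something is really a typo.

```python
def rewrite_typo(original: str, correction: str, confidence: float,
                 threshold: float = 0.9) -> str:
    """Replace outright when confident; otherwise keep both spellings with OR.

    `confidence` is a hypothetical estimate that `original` really is a typo.
    """
    if confidence >= threshold:
        return correction
    return f"({original} OR {correction})"

print(rewrite_typo("reprot", "report", 0.99))   # confident: straight replacement
print(rewrite_typo("Kernel", "Colonel", 0.6))   # doubtful: keep both
```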
Stemming and Morphological Variants
Stemming is something that SharePoint search and Microsoft Search are quite good at in most languages that require stemming, including irregular forms. So in theory it’s not required for most Indo-European, Semitic, and Finno-Ugric languages, which includes most European languages. However, there are exceptions, and it is used in our other language processing, so we do let the LLM do stemming. An example of stemming is that creative, creativity, create, and creator are stemmed to “creat”.
KQL Operators and Advanced Query Logic
Many people don’t realize that you can use KQL (Keyword Query Language) expressions in SharePoint search boxes. And even if they did, they may not be comfortable with the syntax. Even AND, OR, NOT, and parentheses can be intimidating, much less proximity operators like NEAR(3) and ONEAR. You can’t expect users to learn this syntax.
Natural Language Queries: Stopwords
Users would prefer to use natural language queries like “show me the 2022 annual report”. This would work with Google and other search engines, and there was a time when it worked in SharePoint too, but that functionality was removed in 2024. Now it will look for documents that include the words “show” and “me”. A smart query rewriting function would remove these “stopwords” and not include them in the search.
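A minimal sketch of stopword removal in Python follows. The STOPWORDS set and the function name are hypothetical, and a fixed word list is a simplification: a smart rewriter would decide from context whether a word like “show” is filler or part of what the user is looking for.

```python
# Hypothetical stopword list, for illustration only.
STOPWORDS = {"show", "me", "the", "find", "please", "a", "an", "get"}

def strip_stopwords(query: str, stopwords=STOPWORDS) -> str:
    """Drop filler words so that only the meaningful search terms remain."""
    kept = [word for word in query.split() if word.lower() not in stopwords]
    return " ".join(kept)

print(strip_stopwords("show me the 2022 annual report"))
```

For the query in the paragraph above, this leaves “2022 annual report”.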
Natural Language Queries: Metadata and Intent
Natural language queries might also include the intent to use metadata restrictions rather than content search. For example, “the 2022 annual report” is a document with those words in the name, but it might have been written in 2023, while “show me status reports from 2023” might be replaced with '(status report OR ("status" NEAR report) OR (status NEAR update)) AND Created:2023-01-01..2023-12-31'. In this case, it has figured out that documents have a “Created” property, and used the correct syntax for constraining a date type property to a range of dates.
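The date restriction in that example can be built mechanically once the year and the property are known. The Python sketch below shows only that last step, with hypothetical function names; recognizing that “from 2023” refers to the creation date is the part that needs the language model.

```python
from datetime import date

def year_restriction(prop: str, year: int) -> str:
    """Build a KQL range restriction covering every day of the given year."""
    start = date(year, 1, 1)
    end = date(year, 12, 31)
    return f"{prop}:{start.isoformat()}..{end.isoformat()}"

def with_restriction(content_query: str, restriction: str) -> str:
    """Combine a content query with a property restriction using AND."""
    return f"({content_query}) AND {restriction}"

print(year_restriction("Created", 2023))
print(with_restriction("status NEAR report", year_restriction("Created", 2023)))
```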
Knowledge of Metadata Expressions
This knowledge of metadata expressions is very useful for translating the intent of a natural language query into a KQL expression. For now, we are focusing on common metadata properties, with fine-tuning required if you want to teach it about your custom metadata.
How Is PointFire Query Rewriting Different from M365 Copilot Search?
How is this different from Copilot search? Copilot search is actually a hybrid search. When a search query comes in, it is processed with simple query rewriting using a small language model, probably mostly to remove stopwords. It is then sent in parallel to a keyword-based search engine based on Microsoft Search, and to a semantic index, where it is converted to a vector and compared with vectors found in the index, using a measure of semantic similarity. This parallel search retrieves two lists of matching documents, which are combined and ranked. One list will have matching keywords, and the other will have similar documents that use different words. An LLM decides which search results from the combined list are relevant.
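To give a feel for what “combined and ranked” can mean, here is a sketch of reciprocal rank fusion, one common way to merge a keyword result list with a vector result list. This is an assumption for illustration only: nothing here says Copilot uses this method, and the function name, the constant k=60, and the document names are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k: int = 60):
    """Merge ranked lists: each document scores 1 / (k + rank) per list it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Highest combined score first; ties are broken alphabetically.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))

keyword_hits = ["a.docx", "b.docx", "c.docx"]   # hypothetical keyword results
semantic_hits = ["b.docx", "d.docx", "a.docx"]  # hypothetical vector results
print(reciprocal_rank_fusion([keyword_hits, semantic_hits]))
```

Documents that appear in both lists rise to the top, which is the general effect a hybrid search is after.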
Transparent and Reproducible vs. Black-Box Search
This is different from query rewriting, because the Copilot process is mostly a black box. You don’t know exactly what query was sent, how results were ranked, or which ones it discarded as irrelevant. The process is not necessarily reproducible, since LLMs have a random component. Query rewriting like what is used in PointFire Search Summarizer, on the other hand, is transparent. You see the exact query, and you are shown every hit. It is more work for you because it uses your judgment, not its own. But PointFire Search Summarizer also presents a summary of how each document is relevant (or not) and which sentences are particularly pertinent, and it lets you decide whether the document is relevant. It also makes better use of metadata, something that Copilot still has trouble with.
Curious about how this works in practice?
- Email us at sales@icefire.ca with any questions
- Download the free trial on Microsoft AppSource



