ML-powered semantic search in eCommerce
Serge Korzh
9 January 2020
There are many crucial components to an eCommerce shop, but one is indispensable – product search. Unless you have only a handful of products to sell, you cannot really run an eCommerce shop without it. Searching for a product is the key step to a purchase – if a customer cannot find what they want, they will abandon the shop, even if it actually has relevant products on offer. Taking this into account, it seems like improving the search functionality should be one of the top priorities for an eCommerce website, yet it doesn’t always receive the amount of attention it deserves.
Why is product search so hard?
To put it simply, eCommerce search maps a user’s intent to the product, or collection of products, that best satisfies it. If a user wants to buy a “beige cotton shirtdress without any stripes”, the search results should include the one to three products that best match that description. So, what can go wrong with such a query in a typical search?
Getting the meaning of words
The most obvious problem is, well, the semantics of words. First of all, the system must identify that a “shirtdress” is what we’re primarily interested in. But what if there is no notion of a “shirtdress” in the system? Furthermore, can the system understand that, for instance, “beige” is a colour? And, even then, can it also take into account that it is similar to “khaki” or “bleached yellow”? Understanding the meaning of words, the concepts behind them and their relationships with each other is something that a simple lexical search is not capable of. A straightforward way to deal with this problem is to sort all the words a user may type into categories (e.g. “green”, “white”, “yellow” – “colour”), and to group synonyms within these categories (“lime”, “mint” – “green”). This is called Ontology Engineering. One of the biggest English ontologies is WordNet, containing 155 000+ words organized in more than 175 000 synonym groups called synsets. WordNet is published under an open-source licence, and interfaces to it exist for many major programming languages.
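To make the idea of categories and synonym groups concrete, here is a minimal sketch of a hand-built ontology lookup. It is a toy, not WordNet: the entries in `ONTOLOGY`, the choice to file “khaki” under “beige”, and the function names `build_index` and `annotate` are all assumptions made up for illustration.

```python
# Toy ontology: category -> canonical value -> synonyms.
# Every entry here is illustrative, not taken from a real product catalog.
ONTOLOGY = {
    "colour": {
        "green": ["lime", "mint"],
        "beige": ["khaki"],
        "white": ["ivory"],
    },
    "material": {
        "cotton": [],
    },
    "garment": {
        "shirtdress": [],
    },
}


def build_index(ontology):
    """Map every known word (canonical value or synonym) to (category, canonical value)."""
    index = {}
    for category, values in ontology.items():
        for canonical, synonyms in values.items():
            for term in [canonical, *synonyms]:
                index[term] = (category, canonical)
    return index


INDEX = build_index(ONTOLOGY)


def annotate(query):
    """Return {category: set of canonical values} for the known words in a query."""
    found = {}
    for token in query.lower().split():
        if token in INDEX:
            category, canonical = INDEX[token]
            found.setdefault(category, set()).add(canonical)
    return found


# "khaki" is resolved to the colour "beige"; unknown words are simply ignored.
print(annotate("khaki cotton shirtdress"))
```

A real system would also need multi-word terms, typo handling and normalization to base forms, none of which this sketch attempts.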
Dealing with negation
So, it seems that it is enough to extract the words, maybe identify typos and normalize the words to their base form, and then just match them with the categories and synonyms that we have in our database. But let’s return to our query: you may notice that it ends with “without any stripes”. If we process it with our method, we would probably identify “stripes” as a feature or pattern and run into a problem that many search engines have – returning the exact opposite of the desired results for a query with negation. This is because negation is quite hard for a computer to deal with. It can be expressed in many ways (“without stripes”, “no stripes”, “stripeless”), each relating differently to other words (“no” might negate one or more consecutive words, while “-less” negates only the word to which it is attached). In order to deal with negation, we have to take into account not only the meaning of words (including the negation words), but also the order in which the words are placed. This problem has recently seen tremendous progress due to the rise of Deep Learning, and Recurrent Neural Networks in particular. While such systems achieve very high accuracy, they are usually too costly for high-throughput search engines, as they require significantly more time and computational power to process a single query.
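The role of word order can be illustrated with a deliberately simple, rule-based sketch. This is not the deep learning approach mentioned above; it only encodes two hand-written rules, and the names `NEGATORS`, `FILLERS` and `split_negation` are hypothetical. It also assumes a negation word affects only the next content word, which is a simplification: as noted above, “no” can negate several consecutive words.

```python
NEGATORS = {"without", "no", "not"}   # words that negate what follows (illustrative list)
FILLERS = {"any", "a", "an", "the"}   # words that may sit between negator and target


def split_negation(query):
    """Split a query into wanted and unwanted terms using two simple rules."""
    wanted, unwanted = [], []
    negate_next = False
    for token in query.lower().split():
        if token in NEGATORS:
            negate_next = True
        elif negate_next and token in FILLERS:
            # "without any stripes": skip the filler, keep the negation pending.
            continue
        elif negate_next:
            unwanted.append(token)
            negate_next = False
        elif token.endswith("less") and len(token) > 6:
            # "stripeless" -> "stripe"; the length check is a crude guard
            # that skips short words such as "unless".
            unwanted.append(token[:-4])
        else:
            wanted.append(token)
    return wanted, unwanted


wanted, unwanted = split_negation("beige cotton shirtdress without any stripes")
print(wanted)    # terms the product should match
print(unwanted)  # terms the product should NOT match
```

Note that “without any stripes” yields “stripes” while “stripeless” yields “stripe”, so the normalization to base forms mentioned earlier is still needed before the two can be treated as the same feature.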
Okay, so we have determined the meaning of all the words in a query, now what? How do we actually get the most relevant products while leaving out all of the irrelevant ones? Furthermore, how do we rank them? This problem is usually called the precision-recall trade-off. You can think of precision as a kind of signal-to-noise measure: the share of the presented products that are actually relevant. Recall, on the other hand, is a measure of how well we “recalled” the relevant products from our catalog, that is, the share of the relevant products in our catalog that were presented to the user.
You can easily see why it’s a trade-off: if we present only the one best-matching product, we’ll have maximum precision, but because we left out a lot of relevant options, the recall is low; in contrast, if we show the whole catalog to the user, it will surely include all the relevant products, so the recall will be maximal, but the results will contain a lot of noise, so precision will be low.
Obviously, we want a good balance between the two, so how do we rank the products? One such method is called tf-idf (term frequency - inverse document frequency), which is a weighting scheme that allows us to determine how relevant a term (word) is in a document (product information) from a corpus (product catalog). It is frequently used as a central tool in ranking relevance based on a user query. More generally, the problem of ranking items is called Preference Learning and it may employ much more sophisticated machine learning algorithms.
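Both ideas fit in a few lines of code. The sketch below scores a tiny, made-up catalog against a query using one common tf-idf variant (term frequency as a share of the document's words, idf as the logarithm of catalog size over document frequency), and then computes precision and recall for two extreme result lists. The catalog, the relevance judgments in `RELEVANT` and all function names are assumptions for illustration; production search engines use more refined weighting.

```python
import math
from collections import Counter

# Hypothetical four-product catalog: product id -> product description.
CATALOG = {
    "p1": "beige cotton shirtdress",
    "p2": "striped cotton shirt",
    "p3": "green linen dress",
    "p4": "beige linen trousers",
}


def tokenize(text):
    return text.lower().split()


def tf_idf_scores(query, catalog):
    """Score each product by the summed tf-idf weight of the query terms it contains."""
    docs = {pid: Counter(tokenize(text)) for pid, text in catalog.items()}
    n_docs = len(docs)
    scores = {}
    for pid, counts in docs.items():
        total = sum(counts.values())
        score = 0.0
        for term in set(tokenize(query)):
            df = sum(1 for c in docs.values() if term in c)  # documents containing the term
            if df == 0:
                continue  # term unknown to the catalog: contributes nothing
            tf = counts[term] / total      # how prominent the term is in this product
            idf = math.log(n_docs / df)    # rarer terms across the catalog weigh more
            score += tf * idf
        scores[pid] = score
    return scores


def precision_recall(shown, relevant):
    """Precision: share of shown items that are relevant. Recall: share of relevant items shown."""
    shown, relevant = set(shown), set(relevant)
    hits = len(shown & relevant)
    precision = hits / len(shown) if shown else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


scores = tf_idf_scores("beige cotton shirtdress", CATALOG)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking[0])  # the best-scoring product

# Assumed relevance judgments: suppose the user would accept p1 or p4.
RELEVANT = {"p1", "p4"}
print(precision_recall(ranking[:1], RELEVANT))  # only the top hit: precise, but misses p4
print(precision_recall(ranking, RELEVANT))      # whole catalog: complete, but noisy
```

Showing only the top hit gives full precision and half the recall, while showing the whole catalog gives the reverse, which is exactly the trade-off described above.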
Those are just a few of the issues addressed by semantic search. It is easy to see why the task of providing relevant search results is so hard. If we add other factors, such as personal data of the user (i.e. personalized search), the problem becomes even more complex.
Are there other kinds of smart search?
While typing a query with a keyboard is still the main way people search for products, other ways are starting to emerge. According to Google, 20% of searches in the Google app on Android are done by voice, and voice search is being used more and more in eCommerce. So, a natural question is: do the aforementioned considerations and solutions apply? In fact, they do! Voice search is reduced to “text-based” search by performing speech recognition, which outputs the recognised words. Moreover, semantic search is even more important when dealing with voice queries, since they tend to be phrased in more “natural” language (as people would ask a real person) and thus are longer and more complicated to process.
Another recent addition to the way people search is using their cameras – image search is starting to be used in many new and interesting scenarios. For example, not long ago Pinterest introduced Lens, a visual discovery tool that lets you take a photo and search for similar content available on Pinterest. Sure enough, eCommerce is not falling behind – a similar service, Find It On eBay, uses image search to find where to buy items shown in a picture taken by a user. Needless to say, such functionality is built on an entirely different set of machine learning tools, namely object detection and convolutional neural networks. Constructing such systems requires a lot of data and compute power. Nevertheless, both are becoming more and more accessible, making it feasible for smaller companies to implement these systems as well. While the technology is still at an early stage, it has the potential to be hugely useful for customers, so it’s worth keeping an eye on it.
Figuring out exactly what your users want to find is a complex problem, and in this day and age, simple word-matching filtering is not enough to make customers happy. Semantic search tackles this problem by analysing the meaning of words and extracting the original intent of a user’s query. Yet providing relevant results requires not only understanding the user’s intent, but also matching it with the content of the products in your catalog. Voice and pictures are being used more and more as search input by eCommerce shoppers. However, these types of search are yet to be adopted by the industry en masse.