Enterprise Data Management (EDM) Deep Dive
Welcome to a new blog series focused on recent (and perhaps overlooked or less understood) features and enhancements introduced in Oracle Enterprise Data Management (EDM). The series aims to clarify the nuances and not-so-obvious enhancements now available in EDM.
Part 1 of this blog focuses on… (drum roll) - the EDM search engine. You may be thinking, what’s to understand about EDM search, you big silly? You type stuff in and search for it, right? Well, not so fast, my eager friend. Due to recent changes in the underlying EDM search engine, it works a bit differently than you might expect. Until you understand these intricacies, you might find yourself frustrated by how the EDM search feature works.
Before I continue, many thanks to Anurag Garg, Senior Director of Engineering at Oracle for Enterprise Data Management, for providing background information on this topic. Any errors or omissions in this blog are purely my mistakes, not his.
The Switch to Lucene
The change in EDM search functionality started a few months ago with the introduction of a new search framework called Lucene. Lucene (officially known as Apache Lucene) is an open-source search library that has been around for about 20 years. More details on Lucene are available at https://lucene.apache.org/
So why the switch to Lucene? Previously, EDM performed in-memory, index-based searches. While that approach is still used for certain scenarios, more complex searches using this algorithm were draining that precious resource we fondly refer to as memory.
With Lucene, EDM searches are now faster and more efficient due to less reliance on in-memory searches. Even more important, the Lucene framework sets the stage for potential upcoming EDM search enhancements such as:
- Scalability - allows EDM to efficiently search large dimensions with hundreds of thousands of nodes or more.
- Property Searches - all I can say is, hallelujah! This is a much-needed feature.
- Fuzzy Searches - while you might think this means EDM will more quickly find your favorite teddy bear, it really means that EDM will be more effective and efficient in performing searches with approximate search phrases.
Future expansion of EDM search functionality would have been constrained by the original, in-memory index approach, so voila, we have Lucene.
So why Lucene over other search products? As an admitted noob about open-source search engines, I did some Google searches on the topic (Get it?! I used Google search to search for information on search engines!). I discovered that indeed, many other open-source search frameworks exist: Elasticsearch, Solr, and OpenSearchServer, to name a few. These are considered the heavyweights of the search world, offering full-fledged web applications, polished user interfaces, and tons of functionality (along with increased consumption of memory and CPU). But EDM didn’t need those extra bells and whistles or the extra resource consumption they would require. EDM just needed a solid and robust search API, which Lucene provided. Interesting trivia to mention at your next cocktail party: many of the popular search engines mentioned above are based on Lucene.
So now that you have some background, I’ll share a few tips I learned about how the search feature works in EDM. The big “aha” moment for me was realizing that your search string is parsed and tokenized into individual tokens based on spaces. What does that mean? It means a search string of “1000 – Petty Cash” is tokenized into four separate strings:
- 1000
- – (the dash)
- Petty
- Cash
Not only that, the second “aha” moment was realizing that EDM performs an “OR” (or “ANY”) operation across those search tokens, meaning any EDM node containing any one of the tokens in the member name or member description will be returned. The search results are capped at 50 results.
In my example above, any node containing “1000” or “–” or “Petty” or “Cash” would be returned. Clearly, this is not going to return the results you want in many cases, especially when searching on descriptions or when your member names contain spaces.
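To make the tokenize-then-OR behavior concrete, here is a minimal Python sketch of the idea. To be clear, this is not EDM's code or the Lucene API; the names `tokenize`, `or_search`, and `MAX_RESULTS`, the sample nodes, and the case-insensitive "contains" matching are all my own assumptions for illustration.

```python
# Illustrative sketch only: NOT EDM's or Lucene's actual implementation.
# Assumption: matching is a case-insensitive "contains" check.
MAX_RESULTS = 50  # the result cap described in this post


def tokenize(search_string):
    """Split the search string into tokens on whitespace."""
    return search_string.split()


def or_search(search_string, nodes, limit=MAX_RESULTS):
    """Return nodes whose name or description contains ANY token."""
    tokens = [t.lower() for t in tokenize(search_string)]
    matches = []
    for name, description in nodes:
        text = f"{name} {description}".lower()
        if any(token in text for token in tokens):
            matches.append((name, description))
            if len(matches) >= limit:
                break  # stop once the cap is reached
    return matches


# Hypothetical sample nodes: (member name, member description)
nodes = [
    ("1000 – Petty Cash", "Cash on hand"),
    ("1010 – Checking", "Operating cash account"),
    ("2000 – Accounts Payable", "Trade payables"),
    ("3000", "Retained Earnings"),
]

print(tokenize("1000 – Petty Cash"))
for name, _ in or_search("1000 – Petty Cash", nodes):
    print(name)
```

With these sample nodes, the first three all come back: the second matches on the dash and on “cash” in its description, and the third matches on the dash alone. Only the fourth node, which contains none of the tokens, is left out.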
The most important advice I can give at this point is to keep this algorithm in mind and adjust your search phrases accordingly. “Less is more” is often the theme here: using fewer words in your search string, or using unique/less common words, is more likely to return what you are looking for.
To be honest, the current search algorithm can be frustrating. It is fantastic when member names do not contain spaces or when you are searching for a unique (less common) search phrase. I like how EDM will bold the pieces of the member name/description in the search results that match your search phrase.
There are legitimate business situations where more of an “and” or “exact match” type of search is necessary. Thankfully, Oracle is listening and exploring various enhancements (note: safe harbor applies, of course, in terms of direction and timing of search enhancements). One option being explored is to utilize a feature of Lucene that allows search results to be “scored,” where the higher scoring results will bubble to the top of the search results. For this situation, it would be advantageous to include more words in your search phrase so that nodes with a higher number of hits will be scored accordingly and show up first.
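As a rough illustration of why more words would help under a scoring approach, here is a simplified Python sketch that ranks nodes by how many distinct search tokens they contain. This is only my assumption of the general idea: Lucene's real relevance scoring is more sophisticated than a hit count, and the names `score` and `ranked_search` and the sample nodes are hypothetical.

```python
# Simplified sketch: rank by number of distinct tokens found.
# Lucene's actual scoring is more sophisticated; this only shows the idea.


def score(search_string, name, description):
    """Count how many distinct search tokens appear in the node text."""
    tokens = {t.lower() for t in search_string.split()}
    text = f"{name} {description}".lower()
    return sum(1 for token in tokens if token in text)


def ranked_search(search_string, nodes):
    """Return (score, name) pairs for matching nodes, best score first."""
    scored = [(score(search_string, n, d), n) for n, d in nodes]
    hits = [(s, n) for s, n in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)  # stable sort
    return hits


# Hypothetical sample nodes: (member name, member description)
nodes = [
    ("2000 – Accounts Payable", "Trade payables"),
    ("1010 – Checking", "Operating cash account"),
    ("1000 – Petty Cash", "Cash on hand"),
]

for hits, name in ranked_search("1000 – Petty Cash", nodes):
    print(hits, name)
```

Here the node matching all four tokens rises to the top, ahead of the nodes that match only the dash or only “cash”.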
Another option under consideration is allowing the user to select search clauses (e.g. AND, OR, EXACT MATCH) to force the type of search result s/he is seeking.
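For a feel of how those search clauses would differ, here is a small Python sketch. It is purely hypothetical: the function `matches`, the mode names, and my reading of “exact match” as “the whole phrase appears as typed” are assumptions, not anything Oracle has announced.

```python
# Hypothetical sketch of OR / AND / EXACT search modes.
# Assumption: EXACT means the whole phrase appears as typed (ignoring case).


def matches(search_string, text, mode="OR"):
    """Check whether text satisfies the search string under a given mode."""
    tokens = [t.lower() for t in search_string.split()]
    haystack = text.lower()
    if not tokens:
        return False  # an empty search matches nothing
    if mode == "OR":
        return any(t in haystack for t in tokens)
    if mode == "AND":
        return all(t in haystack for t in tokens)
    if mode == "EXACT":
        return search_string.strip().lower() in haystack
    raise ValueError(f"Unknown search mode: {mode}")


for mode in ("OR", "AND", "EXACT"):
    print(mode, matches("Petty Cash", "Cash – Petty", mode))
```

In this example, OR and AND both match because both words are present, while EXACT does not because the words appear in a different order.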
So, there you have it: a brief recap on what I have learned about search APIs, open-source search libraries, how the EDM search function currently works, and enhancements we may see down the road. Stay tuned, and as I learn and hear more, I will be sure to share it.
Check back soon for the next post in the EDM Deep Dive series!