Extending Collaborative Data Discovery
We've talked quite a bit over the past few months about trends we're seeing in the Data Discovery arena around Collaboration and Democratizing Data Discovery. It's not enough to build a great report or a clever dashboard full of unique and impactful insights if it takes forever for you and your team to get there.
And once you've climbed the mountaintop, how do you share what you've learned? How do you bring the rest of your team up to speed and let them build upon what you've found and form their own conclusions?
Over the past few weeks, we've put a lot of time and effort (hence the slowing cadence of our posts, mea culpa) into some of the issues our customers are facing around collaboration, sharing and augmenting the data stored in their Data Discovery applications. Our focus has been on answering the question: "Once I've learned something new, how can I act on it?" To be more specific, we've looked at Oracle Endeca Information Discovery and how users might be able to edit and augment their existing data.
Insight 1: I've found something interesting in my data. I want to highlight it and bring it to the fore.
In the following example, I've identified a few high-performing schools in the Chicago Public School system. By filtering for high 11th Grade ACT scores, Student Safety and Parental Involvement, I've narrowed the list down to three schools I'd like to target and track over the coming year.
I select the three schools and use a new Custom Action that we've developed to "tag" or "modify" these records. In this example, I'm going to add the value "5-Star" to the Rating attribute for these schools.
The tagging is completely free-form (in this case) and allows the user to update these records in the Endeca index on the fly. Upon clicking Save, the update is applied and available to all users of the application. This is a relatively simple example, but giving users the ability to augment their data with custom attribution can have a huge impact. For example, you can encourage collaboration by having one team member sift through warranty data to identify bogus claims (based on vehicle history, customer information, etc.) and, through this "tagging" process, let all users refine, search or slice and dice based on this attribute.
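To make the tagging idea concrete, here is a minimal sketch in Python. It is not the Custom Action itself and does not use any Endeca API: the index is a plain in-memory dictionary, and the function names (`tag_records`, `refine`) and the school identifiers are placeholders made up for illustration. It only shows the shape of the operation: add a value to an attribute on the selected records, then let anyone refine by that value.

```python
from typing import Dict, List

# Hypothetical stand-in for an index: record id -> attribute name -> list of values.
Index = Dict[str, Dict[str, List[str]]]


def tag_records(index: Index, record_ids: List[str], attribute: str, value: str) -> int:
    """Add `value` to `attribute` on each selected record.

    Returns the number of records that actually changed, so tagging the
    same record twice with the same value has no further effect.
    """
    changed = 0
    for record_id in record_ids:
        values = index[record_id].setdefault(attribute, [])
        if value not in values:
            values.append(value)
            changed += 1
    return changed


def refine(index: Index, attribute: str, value: str) -> List[str]:
    """Return the ids of all records carrying `value` on `attribute`, sorted."""
    return sorted(
        record_id
        for record_id, attributes in index.items()
        if value in attributes.get(attribute, [])
    )


# Placeholder records; the ids and names are invented for this sketch.
index: Index = {
    "school-a": {"Name": ["School A"]},
    "school-b": {"Name": ["School B"]},
    "school-c": {"Name": ["School C"]},
    "school-d": {"Name": ["School D"]},
}

# Tag the three selected schools, then refine by the new attribute value.
tag_records(index, ["school-a", "school-b", "school-c"], "Rating", "5-Star")
print(refine(index, "Rating", "5-Star"))
```

Keeping each attribute as a list of values mirrors the free-form nature of the tagging described above: a record can carry several tags, and a tag that does not exist yet is simply created on first use.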
Insight 2: This data has been mis-categorized or needs to be updated
Following from the Chicago Public Schools example, I've identified a school that had an incorrect Parent Engagement score. Rather than load all the data in again from scratch, I want to let my users update the record on the fly and keep working.
I specify the attribute I want to update, specify the new value (or list of values) and click Save to update my Endeca index in a matter of seconds.
Insight 3: This data is either irrelevant, insecure or should not exist in my application
Recently, the City of Chicago decided to close 50 public schools as part of its most recent budget crunch. If I'm exploring data about the schools with an eye toward choosing the best one for my kids, I probably don't need to include defunct schools in my analysis. I navigate to the schools that are being closed and use the Delete Action to remove them from my index.
In this case, we've deleted them permanently from Endeca by persisting the transaction to a dedicated store. However, whether these operations are applied permanently to an Endeca instance (so the records are never reloaded) or only temporarily is a decision we leave up to customers. We see the need for this in the field all the time: certain "bad records" pop up (e.g., sales amounts with negative dollar values) and end up skewing some of the key reporting metrics.
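One way to picture "persisting the transaction to a dedicated store" is a small log of edits that gets replayed whenever the data is reloaded. The sketch below is an assumption about how such a store could work, not a description of our implementation or of any Endeca API: the JSON-lines file format and the function names (`log_delete`, `replay`) are invented for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict

Record = Dict[str, object]


def log_delete(log_path: Path, record_id: str) -> None:
    """Append a delete operation to the (hypothetical) edit log."""
    with log_path.open("a", encoding="utf-8") as log_file:
        log_file.write(json.dumps({"op": "delete", "id": record_id}) + "\n")


def replay(log_path: Path, records: Dict[str, Record]) -> Dict[str, Record]:
    """Return a copy of `records` with every logged delete applied.

    The input is left untouched, so a fresh load can always be compared
    against the edited view.
    """
    result = dict(records)
    if not log_path.exists():
        return result
    with log_path.open(encoding="utf-8") as log_file:
        for line in log_file:
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            if entry["op"] == "delete":
                # Ignore ids that are no longer present in the source data.
                result.pop(entry["id"], None)
    return result


# Placeholder data: one closed school, one open school.
with tempfile.TemporaryDirectory() as workdir:
    log_path = Path(workdir) / "edits.jsonl"
    log_delete(log_path, "closed-school")

    reloaded = {
        "closed-school": {"Name": "Closed School"},
        "open-school": {"Name": "Open School"},
    }
    print(sorted(replay(log_path, reloaded)))
```

Under this design, the permanent-versus-temporary choice comes down to whether the log is replayed after each reload: replay it and the deleted records stay gone, skip it (or discard the log) and they come back with the next load.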
Although these are simple examples, you can start to see how broadly these techniques apply. In a follow-up post coming later this week, we'll revisit these scenarios and explore real-life discovery examples of how they can be used, from healthcare cohort analysis to secure enterprise search.