Imhotep: Large Scale Analytics and Machine Learning at Indeed

This talk was held on Wednesday, March 26, 2014, at 7:00 PM.

To scale the building of decision trees on large amounts of Indeed job search data, we created a system called Imhotep. In addition to being a crucial tool for building these machine learning models, Imhotep has proven to be applicable to many different analytics problems. The core of Imhotep is a distributed system that manages the parallel execution of queries across a set of time-sharded inverted indices.
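To make the idea of running a query across time-sharded inverted indices concrete, here is a minimal, hypothetical sketch in Python. It is not Imhotep's actual code or API: the `Shard` class, the `build_shards` and `count_matching` functions, the one-shard-per-day layout, and the sample data are all assumptions chosen for illustration, and a thread pool on one machine stands in for a distributed cluster.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from datetime import date


class Shard:
    """Hypothetical inverted index over the documents of one day."""

    def __init__(self, day):
        self.day = day
        self.num_docs = 0
        # (field, term) -> list of ids of documents containing that term
        self.postings = defaultdict(list)

    def add(self, doc):
        """Index one document, given as a dict of field -> term."""
        doc_id = self.num_docs
        self.num_docs += 1
        for field, term in doc.items():
            self.postings[(field, term)].append(doc_id)

    def count(self, field, term):
        """Number of documents in this shard where field == term."""
        return len(self.postings.get((field, term), []))


def build_shards(events):
    """Group (day, doc) pairs into one shard per day (an assumed layout)."""
    shards = {}
    for day, doc in events:
        shards.setdefault(day, Shard(day)).add(doc)
    return shards


def count_matching(shards, field, term, start, end):
    """Count documents with field == term in [start, end).

    Only shards inside the time range are touched, and each selected
    shard is queried as a separate task; the partial counts are summed.
    """
    selected = [s for d, s in shards.items() if start <= d < end]
    with ThreadPoolExecutor() as pool:
        return sum(pool.map(lambda s: s.count(field, term), selected))


# Made-up sample data, for illustration only.
events = [
    (date(2014, 3, 24), {"country": "us", "query": "java"}),
    (date(2014, 3, 24), {"country": "jp", "query": "java"}),
    (date(2014, 3, 25), {"country": "us", "query": "python"}),
    (date(2014, 3, 26), {"country": "us", "query": "java"}),
]
shards = build_shards(events)
total = count_matching(
    shards, "country", "us", date(2014, 3, 24), date(2014, 3, 26)
)
```

In this sketch, sharding by time means a query over a date range only has to open the shards for those days, and because each shard is answered independently, the per-shard work can run in parallel and be combined with a simple sum.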

This talk will cover Imhotep’s primitive operations that allow us to build decision trees, drill into data, build graphs, and even execute SQL-like queries in IQL (Imhotep Query Language). We will also discuss what makes Imhotep fast, highly available, and fault tolerant.