
The Data Technology team at MSCI is responsible for meeting the data requirements across various business areas, including Index, Analytics, and Sustainability. Our team collates data from multiple sources such as vendors (e.g., Bloomberg, Reuters), website acquisitions, and web scraping (e.g., financial news sites, company websites, exchange websites, filings). This data can be in structured or semi-structured formats. We normalize the data, perform quality checks, assign internal identifiers, and release it to downstream applications.
As data engineers, we build scalable systems to process data in various formats and volumes, ranging from megabytes to terabytes. Our systems perform quality checks, match data across various sources, and release it in multiple formats. We leverage the latest technologies, sources, and tools to process the data. Some of the exciting technologies we work with include Snowflake, Databricks, and Apache Spark.
Core Java, Spring Boot, Apache Spark, Spring Batch, Python. Exposure to sql databases like Oracle, Mysql, Microsoft Sql is a must. Any experience/knowledge/certification on Cloud technology preferrably Microsoft Azure or Google cloud platform is good to have. Exposures to non sql databases like Neo4j or Document database is again good to have.