Autotrader: Enabling Data Science With Hadoop

Document technical information

Format pdf
Size 1.1 MB
First found Nov 13, 2015

Transcript

Enabling Data Science with Hadoop
A few words about AutoTrader
• Largest on-line media firm focused on the Automotive sector
• Acquired KBB.com, vAuto, VINSolutions, HomeNet, AuctionGenius and AIS
• Provide media and software solutions to Car Manufacturers & Dealers

What did Big Data look like at ATC?
[Diagram: Business Intelligence side: Weblogs & all enterprise data, Staging & Ingestion, EDW, Marts, BI Tools, on an MPP Appliance sized for overnight processing. Data Science side, fed by Nightly Replication: EDW, Marts, Analytics Tools, on an MPP Appliance sized for data archival.]
So what’s wrong with this model?
• If you only have a hammer, every problem looks like a nail
• MPP Appliances are hard to scale predictably
• Reporting demands Aggregation, Analysis resists Aggregation
• Most Analysis scenarios become IT projects
• Time-to-Insights is in the order of weeks & months
• No intra-day insights

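The point that reporting demands aggregation while analysis resists it can be illustrated with a minimal Python sketch. The event records, field layout and both function names below are hypothetical and do not come from the presentation; they only show that a question about visitor behaviour cannot be answered from a daily roll-up.

```python
from collections import defaultdict

# Hypothetical raw weblog events in time order: (day, visitor_id, page).
events = [
    ("2015-11-01", "v1", "search"),
    ("2015-11-01", "v1", "listing"),
    ("2015-11-01", "v2", "search"),
    ("2015-11-02", "v1", "listing"),
    ("2015-11-02", "v3", "search"),
]

def daily_page_views(rows):
    """Reporting view: total page views per day (an aggregate)."""
    counts = defaultdict(int)
    for day, _visitor, _page in rows:
        counts[day] += 1
    return dict(counts)

def searched_then_viewed_listing(rows):
    """Analysis view: visitors who viewed a listing after searching.

    This needs event-level detail; the daily totals cannot answer it.
    """
    searched = set()
    converted = set()
    for _day, visitor, page in rows:
        if page == "search":
            searched.add(visitor)
        elif page == "listing" and visitor in searched:
            converted.add(visitor)
    return converted

report = daily_page_views(events)
converted = searched_then_viewed_listing(events)
print(report)
print(converted)
```

Once the data is stored only as the daily totals in `report`, the visitor-level question behind `converted` can no longer be asked, which is why an aggregated mart tends to turn each new analysis into a separate project.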
Where are we today?
[Diagram: Business Intelligence side: Weblogs & all enterprise data, Staging & Ingestion, EDW, Marts, BI Tools, on an MPP Appliance sized for overnight processing. Data Science side: Weblogs & most enterprise data, Staging & Ingestion, Hive & HBase, Data Prep Tool, Visualization Tool, Analytics Tools, on a Commodity Hadoop Cluster.]
What are the guiding principles?
• Clearly differentiate between the Governed BI environment and the Semi-governed Data Science environment
• Process data as little as possible in the Hadoop cluster
• Provide Analysts with the ability to ingest & integrate data, rapidly iterate their analysis and prepare datasets for Analytic Modeling
• Govern Analytic Models as Enterprise Meta-data
• Ingest data near-real-time

What are the key shifts?
                             From                     To
Time to Insights             Weeks - Months           Hours
Ingestion Frequency          Nightly                  Real-time
How Biz value is realized    IT Project               Self-service
How the platform scales      About $2M / Appliance    About $7K / Node
Cost / TB                    About $2K                About $400
Analytics Tools              Legacy                   Industry-leading
Role of IT                   Build                    Enable

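The cost figures in the key-shifts comparison can be put side by side with a little arithmetic. The sketch below uses only the approximate numbers from the slide (about $2M per appliance, about $7K per node, about $2K and about $400 per TB); the 100 TB workload and the `storage_cost` helper are hypothetical, chosen only for illustration.

```python
# Approximate figures quoted on the slide (US dollars).
APPLIANCE_UNIT_COST = 2_000_000   # about $2M per MPP appliance
NODE_UNIT_COST = 7_000            # about $7K per Hadoop node
APPLIANCE_COST_PER_TB = 2_000     # about $2K per TB
HADOOP_COST_PER_TB = 400          # about $400 per TB

def storage_cost(terabytes, cost_per_tb):
    """Rough storage cost for a given volume (hypothetical helper)."""
    return terabytes * cost_per_tb

# How many times cheaper one TB is on the Hadoop cluster.
cost_ratio = APPLIANCE_COST_PER_TB / HADOOP_COST_PER_TB

# How many whole nodes the price of one appliance would buy.
nodes_per_appliance_price = APPLIANCE_UNIT_COST // NODE_UNIT_COST

# Hypothetical 100 TB workload, not a figure from the presentation.
appliance_total = storage_cost(100, APPLIANCE_COST_PER_TB)
hadoop_total = storage_cost(100, HADOOP_COST_PER_TB)

print(cost_ratio, nodes_per_appliance_price, appliance_total, hadoop_total)
```

On these figures the per-TB cost drops by a factor of five, and the scaling increment shrinks from one appliance to a single node, so capacity can be added in much smaller steps.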
What else is Hadoop enabling?
• Extend the life of our MPP BI Appliance
• Real-time Analytics – Understand & influence Customer Decisions within seconds
• Shift the focus to how Analytics can help build the next-generation of Products

Questions?