
Deep partitioning in Hive (Why and how?)

There are many situations where we need to update a record in Hive and the only option
available is to overwrite the complete table or a partition.

Let's consider a use case: we are importing many tables from an OLTP system into a Hadoop cluster with Sqoop. Later we want to keep a ledger table in HDFS that records the source OLTP row count and the imported table's row count, so we can reconcile source and target.
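For context, each such import might look like the Sqoop command sketched below; the JDBC URL, credentials, and target directory are illustrative assumptions, not details from this setup:

# Illustrative only: host, database, user, and target directory are assumptions
sqoop import \
  --connect jdbc:mysql://oltp-host:3306/sales \
  --username etl_user -P \
  --table bookings \
  --target-dir /data/sales/bookings/${fromdate} \
  --num-mappers 4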

Table design:

date       | table_group | table_name | system | count | timestamp
2016-09-01 | sales       | bookings   | OLTP   | 29998 | 2016-08-31 03:14:07.99
2016-09-01 | sales       | bookings   | Hive   | 29998 | 2016-08-31 03:14:15.99
2016-09-02 | sales       | forecasts  | OLTP   | 15888 | 2016-09-01 03:14:07.99
2016-09-02 | sales       | forecasts  | Hive   | 15887 | 2016-09-01 03:14:15.99

These tables are loaded into HDFS by Sqoop at different times in the data pipeline. If we partitioned the ledger table by date alone, we would have to collect the row counts of all the tables as a batch process after the daily load completes, leaving a delay between the actual load and the ledger update.

What happens if this table is partitioned by date, table_group, table_name, and system? Each granular partition then holds exactly one row, which means we can mimic an update in Hive by dropping a one-row file with the count into the right partition as and when each table finishes loading.

Implementation:

CREATE EXTERNAL TABLE `count_ledger` (
  `count` string,
  `ts` string)
PARTITIONED BY (
  `date` string,
  `table_group` string,
  `table_name` string,
  `system` string)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  LINES TERMINATED BY '\n'
LOCATION '/data/count_ledger';
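Each count file is a single CSV line whose fields match the two non-partition columns (count, ts). A minimal sketch of producing one, with an illustrative file name and values:

# One row: count and timestamp, comma-separated to match
# the table's FIELDS TERMINATED BY ','
echo "29998,2016-08-31 03:14:15.99" > /home/Hive_booking_count.dat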
HDFS put to the external directory for the "OLTP count" and the "Hive count" (note that each directory level is named partition_column=value, matching the DDL):

hadoop fs -put /home/OLTP_booking_count.dat \
  /data/count_ledger/date=${fromdate}/table_group=sales/table_name=bookings/system=OLTP/

hadoop fs -put /home/Hive_booking_count.dat \
  /data/count_ledger/date=${fromdate}/table_group=sales/table_name=bookings/system=Hive/

MSCK repair for updating partitions automatically:

MSCK REPAIR TABLE count_ledger;

Note: We created the HDFS directory structure to mirror the partition columns (one name=value directory per partition column, in order), which is essential for MSCK to discover the partitions. The big benefit is that we don't have to issue ALTER TABLE commands to add partitions.
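For comparison, without MSCK each new partition would have to be registered explicitly with a statement like the sketch below (shown only to illustrate what MSCK saves us from scripting per table, per system, per day):

ALTER TABLE count_ledger ADD IF NOT EXISTS
PARTITION (`date`='2016-09-01', table_group='sales',
           table_name='bookings', `system`='OLTP');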

HDFS directory:

/data/count_ledger/date=2016-09-01/table_group=sales/table_name=bookings/system=Hive/

resembles the Hive table partitioning:

PARTITIONED BY (
  `date` string,
  `table_group` string,
  `table_name` string,
  `system` string)

Just run MSCK at any time to keep the table's partitions in sync with the HDFS directory structure.

Mimicking update: In this design we can update the count of any table at any time by overwriting its one-row file in HDFS, and we can insert a new record with a simple HDFS put; in both cases the data is immediately available for querying.
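As a sketch of such an in-place update, the -f flag of hadoop fs -put overwrites the existing file for a partition (file name and date are illustrative):

# -f replaces the existing one-row file, effectively updating the record
hadoop fs -put -f /home/Hive_booking_count.dat \
  /data/count_ledger/date=2016-09-01/table_group=sales/table_name=bookings/system=Hive/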

Performance: Query performance improves because most of the filtering is resolved from the metastore itself (partition pruning); Hive only has to read the contents of a single one-record file.

SELECT `count`
FROM count_ledger
WHERE `date` = '2016-09-01' AND table_group = 'sales'
  AND table_name = 'bookings' AND `system` = 'Hive';

With the sample data above, this returns 29998.

Cons: This design creates numerous partitions, each backed by a tiny file, which adds to NameNode memory pressure (the NameNode tracks every file and block in memory). For example, 100 tables × 2 systems × 365 days is already about 73,000 partitions a year. So this idea is best reserved for status tables or smaller aggregate tables (like a country-level summary).
