Analytics and In-Memory Databases Are Changing Data Centers

Sabre
DCAWD Founding Member
Posts: 21432
Joined: Wed Aug 11, 2004 8:00 pm
Location: Springfield, VA

Post by Sabre »

ECRM Guide article
Power and Cooling, Memory Needs Drive Change

The trend toward in-memory analytics is being driven in part by power, cooling and memory demands.
As the database grows, more and more servers are required, and power and cooling requirements grow with the server count. Larger and larger amounts of memory are also needed, given the number of cores and the processing power available in each CPU. And the more memory you require, the more power and cooling you need. Depending on the processor, the power drawn by DDR3 in even 4 DIMM slots can exceed the power drawn by a single processor. Using more than 4 DIMM slots per processor is becoming more common, and of course the core count is growing with no end in sight. If you assume that you need 2 GB of memory per core for efficient analytics processing, a 12-core CPU will need 24 GB of memory. Matching the performance of main memory with flash drives or PCIe flash devices is not cost-effective, given the number of PCIe buses needed, the cost of the devices and the complexity of using them compared to just using memory.
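
The 2 GB per core rule of thumb above is simple enough to sketch in a few lines. The function names, the four-socket example and the 8 GB DIMM size are illustrative assumptions, not figures from the article.

```python
import math

def memory_needed_gb(cores_per_cpu, gb_per_core=2, sockets=1):
    """Memory implied by a fixed GB-per-core rule of thumb."""
    return cores_per_cpu * gb_per_core * sockets

def dimms_needed(total_gb, dimm_gb=8):
    """DIMM count for a given capacity; dimm_gb=8 is an assumed module size."""
    return math.ceil(total_gb / dimm_gb)

print(memory_needed_gb(12))                # 24 (the article's 12-core example)
print(memory_needed_gb(12, sockets=4))     # 96 (hypothetical four-socket server)
print(dimms_needed(memory_needed_gb(12)))  # 3
```

The point of the sketch is only that memory scales linearly with core count under this assumption, so every increase in cores per socket pulls DIMM count, and with it power and cooling, up along with it.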

I/O Complexity Slows the Data Path

Another reason many data analytics programs have moved to memory-only methods is I/O complexity and latency.
Historically, I/O has been thousands of times slower than memory in both latency and bandwidth. Even with flash devices, I/O latency is measured in tens of microseconds or more, compared with roughly a hundred nanoseconds for main memory. Even if flash device latency improves, a request still has to go through the OS and the PCIe bus, whereas a memory access is a direct hardware address translation.
Knowledge of how to do I/O efficiently is limited because I/O programming is rarely taught in schools. Other than using the C library fopen/fread/fwrite, not much more is taught from what I have seen. Even if high-performance, low-latency I/O programming were taught, there would still be significant limits to performance because of minimal interfaces.
The cost of I/O in terms of operating system interrupt overhead, latency and the path through the I/O stack is another limitation. Every I/O request requires a call into the operating system, which adds overhead and latency that cannot be eliminated given current operating system implementations.
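
The per-request cost of calling into the operating system can be made visible with a rough sketch like the one below. It is not a storage benchmark: the temporary file will almost certainly be served from the page cache, so what it mostly exposes is the system call overhead itself rather than device latency. The chunk size, data size and function names are arbitrary choices for illustration, and a POSIX-like system is assumed.

```python
import os
import tempfile
import time

CHUNK = 64        # bytes per request; deliberately tiny to expose per-call cost
TOTAL = 1 << 20   # 1 MiB of test data

def read_via_syscalls(path):
    """Read the file CHUNK bytes at a time; each os.read is one system call."""
    fd = os.open(path, os.O_RDONLY)
    try:
        parts = []
        while True:
            buf = os.read(fd, CHUNK)
            if not buf:
                break
            parts.append(buf)
        return b"".join(parts)
    finally:
        os.close(fd)

def read_from_memory(data):
    """Copy the same bytes in CHUNK-sized pieces without entering the kernel."""
    view = memoryview(data)
    return b"".join(bytes(view[i:i + CHUNK]) for i in range(0, len(data), CHUNK))

def timed(fn, arg):
    start = time.perf_counter()
    result = fn(arg)
    return result, time.perf_counter() - start

def compare():
    """Return (seconds via system calls, seconds in memory) for identical data."""
    data = os.urandom(TOTAL)
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(data)
        path = f.name
    try:
        from_file, t_io = timed(read_via_syscalls, path)
        from_mem, t_mem = timed(read_from_memory, data)
    finally:
        os.unlink(path)
    assert from_file == from_mem == data
    return t_io, t_mem

if __name__ == "__main__":
    t_io, t_mem = compare()
    print(f"system call path: {t_io:.4f}s, in-memory path: {t_mem:.4f}s")
```

Absolute numbers will vary by machine. Any gap between the two paths here comes from crossing into the kernel on every request, before a real device adds its own latency on top.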
The problem is that the I/O stack has not changed much at all in 30 years. This is what the data path looks like currently:

[Image: diagram of the I/O data path]

There are no major changes on the horizon for the I/O stack, which means that any application still has to go through interrupting the operating system, the file system POSIX layer and the SCSI/SATA driver. There are some flash PCIe vendors that have developed changes to the I/O stack, but they are proprietary. I see nothing on the standards horizon that looks like a proposal, much less something that all of the vendors can agree upon. The standards process is controlled by a myriad of different groups, so I have little hope of change, which is why storage will be relegated to checkpointing and restarting these in-memory applications. It is clear that data analytics cannot be efficiently accomplished using disk drives, even flash drives, as you will have expensive CPUs sitting idle.
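
The checkpoint-and-restart role described above can be sketched roughly as follows: the working data lives in memory, and storage is touched only to save a snapshot and to reload it after a restart. This is a minimal illustration, not how any particular in-memory database does it. The function names and the use of pickle are assumptions made for the example.

```python
import os
import pickle
import tempfile

def checkpoint(state, path):
    """Write state to a temp file, flush it to disk, then atomically swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f, protocol=pickle.HIGHEST_PROTOCOL)
            f.flush()
            os.fsync(f.fileno())   # make sure the bytes reach stable storage
        os.replace(tmp, path)      # readers see the old or the new file, never half
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise

def restart(path, default=None):
    """Reload the last checkpoint, or return default if none exists yet."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return default

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        ckpt = os.path.join(d, "table.ckpt")   # hypothetical checkpoint file
        table = restart(ckpt, default={})      # cold start: nothing on disk yet
        table["clicks"] = table.get("clicks", 0) + 1
        checkpoint(table, ckpt)
        print(restart(ckpt))
```

All queries would run against the in-memory structure. The I/O stack is only on the path during the periodic snapshot, which is why its latency matters far less in this arrangement.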

Future Analytics Architectures

As data analytics demands more and more memory because of the increase in CPU core counts, new memory technologies will need to be developed and brought to market to address the requirements. Things like double-stacked DDR3, phase change memory (PCM), memristors and other technologies are going to be required to meet the needs of this market. Data analytics is a memory-intensive application, and even high-performance storage does not have enough bandwidth to address the requirements. Data analytics applications are also latency intolerant. The storage stack is high latency and OS hungry compared to memory, even slower memory such as PCM or memristors, and that latency cannot be changed, so what you end up with is a memory-based application.

Interesting article... though probably nothing we didn't all see coming.
Sabre (Julian)
92.5% Stock 04 STI
Good choice putting $4,000 rims on your 1990 Honda Civic. That's like Betty White going out and getting her tits done.
complacent
DCAWD Founding Member
Posts: 11651
Joined: Sun Aug 29, 2004 8:00 pm
Location: near the rockies. very.

Post by complacent »

everything has to evolve, right? 8)
colin

a tank, a yammie, a spaceship
i <3 teh 00ntz
PGT
DCAWD Groupie
Posts: 1578
Joined: Mon Jun 04, 2007 11:06 am
Location: Loudoun

Post by PGT »

I work with arguably one of the top guys in memory analytics in the country (if not the world). I'll have to shoot this to him.
2013 BMW 328i M Sport with 8sp in Estoril Blue II
2012 Chrysler 300C SRT8 - Always bet on black
2012 Jeep Wrangler Unlimited Rubicon Call of Duty: Modern Warfare 3 Edition, otherwise known as the MW3 (and badass)

Post by Sabre »

Keep in mind they are talking about something a little different here... but he will still probably find it interesting :)

Post by PGT »

different how?

Post by Sabre »

Doesn't your guy deal with pulling the bits off of the sticks of RAM for analysis after an attack on a server? The above is talking about using more RAM to do analytics because it's faster.

Post by PGT »

we do virtualized computing (which is inherently memory intensive) - he's doing realtime analytics of that memory to protect the hypervisor from cloud-bursting exploits.

Post by Sabre »

Ah, gotcha :)

Post by PGT »

256GB on a blade? Yep... memory is where it's at. :lol:

Image
Raven
Mr. Underpowered or something
Posts: 1221
Joined: Thu Feb 18, 2010 12:46 pm
Location: Manasty

Post by Raven »

Maaaaaan, everyone gets to play with cooler toys than me. :cry:
All my cars have drum brakes and are sub 200 hp, what am I doing with my life?
2013 Mazda 2
1994 Chevy S10 pickup
1985 Chevy Caprice (no fuel system)

Post by PGT »

oh, I don't play with it. I just help sell it. :nana:
steed77
I'm a n000b
Posts: 30
Joined: Mon Dec 06, 2010 2:30 pm
Location: NoVa

Post by steed77 »

I have been just going over this the past few weeks.

Have you guys had a chance to play with these?
http://www.fusionio.com/

I checked out one of the IO duals a few weeks ago... SEXY as anything I have messed with in several years..... stripe a pair and get 1.9M IOPS. INSANE!!! Pack 6 or 8 I/O cards in a DL890 and you are talking several (5ish) TB of rockin' disk speed, back it with 8 sockets and 2TB of 1333 RAM... ALL IN one 8U box.

Also some of the apps are licensing out core-linear. So I'm trying to figure out how to pack 4000 cores in a few racks. It's been a fun week.
03 SVT Lighting 488hp/560tq
05 Evo 8 505hp/411tq
05 4.8is X5
09 versa

Post by complacent »

i've only ever read about their awesomeness... never had the chance first hand. :-(

sounds like a crazy amount of fun/potential.

Post by Sabre »

steed77 wrote:I have been just going over this the past few weeks.

Have you guys had a chance to play with these?
http://www.fusionio.com/
...

Also some of the apps are licensing out core-linear. So I'm trying to figure out how to pack 4000 cores in a few racks. It's been a fun week.
I'm going to have to check those out!

4000 cores in a few racks eh? hmm.. Quad AMD 6176se's would get you there in two racks. I'd be more inclined to go with something based on the X5680 though for raw speed.

Post by steed77 »

AMD is a no go here.... must be Intel.

I do have a 1500-core rack solution, but it will require special cooling provisions. Currently I am running ~780 cores/rack. Also need a good amount of RAM to support it, so figure ~10-12TB per rack.


Post by Sabre »

Who's the solution from? Even using Supermicro's TwinBlade system I can only get to 1440 cores/42U.

Post by steed77 »

I have been looking at some of the HP solutions

BL2x220 G6's will stack out with 4 c7000's with 1,536 cores, and 12TB Mem at 42U...
OR
DL2000 w/DL170e Nodes, at 42U 1,008 cores and 16TB of Mem.
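
A quick way to compare the two options is memory per core and the number of racks needed for the 4000-core target mentioned earlier in the thread. The sketch below uses only the per-rack figures quoted above. It assumes 1 TB = 1024 GB and ignores rack space for switches, power and cooling, so treat the results as rough.

```python
import math

TARGET_CORES = 4000   # core count mentioned earlier in the thread

# Per-rack (42U) figures as quoted in the post above.
CONFIGS = {
    "BL2x220 G6 in 4x c7000": {"cores": 1536, "mem_tb": 12},
    "DL2000 w/ DL170e nodes": {"cores": 1008, "mem_tb": 16},
}

def gb_per_core(cores, mem_tb):
    """Memory per core, assuming 1 TB = 1024 GB."""
    return mem_tb * 1024 / cores

def racks_needed(cores_per_rack, target=TARGET_CORES):
    """Whole racks required to reach the target core count."""
    return math.ceil(target / cores_per_rack)

for name, cfg in CONFIGS.items():
    print(f"{name}: {gb_per_core(**cfg):.1f} GB/core, "
          f"{racks_needed(cfg['cores'])} racks for {TARGET_CORES} cores")
```

On these numbers the blade option works out to 8.0 GB per core and 3 racks, and the DL2000 option to about 16.3 GB per core and 4 racks. Both are well above the 2 GB per core assumed in the article at the top of the thread.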

This is with the current offerings. Also there is lots of room for Fusion-io cards... still not sure on that yet.

Still working on thermals though...

Post by PGT »

Dave/Acquacow is an SE with Fusion-IO, by the way.

Post by complacent »

we've had really good luck with both the dl2000's and the blades. the dl's are great to load up with disks for tasks you don't want stored on a san... been using them since beige G3's.

jesus herald christ a fully-loaded blade chassis is heavy! it takes a while to unpack the pallet. watch out for bad switches, especially if you opt for the cisco models. we've had a couple show up doa with a fried backplane. :-/

Post by steed77 »

Yea I am leaning toward the dl2000's myself. Have to get it past mgmt but the $$$ will be a determining factor, sorta.

Heck yes.. I had to rack 8 of them a few weeks back. Even with 2 people and no blades installed, the upper parts of the rack were tricky. Also did not have a lift at this site... that day sucked.
I think fully loaded it's ~850 lbs. Been lucky so far with the Cisco switches. However that will soon end, since Cisco and HP are not playing well together. Sure wish the Nexus line would make it to the c7000, but that will not happen for a long time.

Side note: been totally blown away by the DL360 G7's... Almost fully loaded up: 192GB RAM, 2 procs, 8 10K 146GB drives, 4 onboard NICs, 2 slots avail (1 half-height, 1 full-height) and in a 1U form.

ok enough rambling.

thanks for the input guys.

Post by Sabre »

BTW, update to the servers for this project (which I presume you have already done).
Dell releases C6145:
The PowerEdge C6145 is a 2U server offering up to 96 cores of processing power, designed for HPC applications, video rendering, virtualization, and Electronic Design Automation (EDA) workloads requiring high core counts, memory density and expanded I/O capabilities. The C6145 features the AMD Opteron 6000, available in two independent 4-socket server nodes, allowing users to scale up to 96 cores and up to 1 terabyte of memory.
So 2016 cores, 21TB of RAM and a ton of HD space in 42U.
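
The rack totals follow directly from the per-chassis maximums in the Dell description (2U, up to 96 cores, up to 1 TB). The sketch below assumes the whole 42U is filled with chassis, with no space set aside for switches or PDUs. The constant and function names are made up for the example.

```python
RACK_U = 42
CHASSIS_U = 2              # the C6145 is described as a 2U server
CORES_PER_CHASSIS = 96     # "up to 96 cores" across both nodes
MEM_TB_PER_CHASSIS = 1     # "up to 1 terabyte of memory"

def rack_totals(rack_u=RACK_U):
    """Chassis count, cores and memory (TB) for a rack filled with chassis."""
    chassis = rack_u // CHASSIS_U
    return {
        "chassis": chassis,
        "cores": chassis * CORES_PER_CHASSIS,
        "mem_tb": chassis * MEM_TB_PER_CHASSIS,
    }

print(rack_totals())  # {'chassis': 21, 'cores': 2016, 'mem_tb': 21}
```

With 21 chassis per rack, the quoted maximums give 2016 cores and 21 TB of RAM.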