Data Furnaces (nytimes.com)
102 points by sew on Nov 27, 2011 | hide | past | favorite | 50 comments


In distributed computing, which is the kind of computing that tends to go on in data centres, there are two types of problem - large amounts of CPU on relatively small amounts of data, or small amounts of CPU on a relatively large amount of data.

The former is generally found in scientific computing (say, protein folding studies, or SETI@Home). In these kinds of problems, it's acceptable to have a widely distributed system with significant latency between nodes, because even though it takes you a long time to move the data across to another node, it's offset by the time you save in having another CPU perform work on it.

The latter is generally what you find in the business and web applications that are generating most of the heat in data centres today. These are things like stats aggregation, building search indices, or plain old DB seeks. In these cases, it often doesn't make sense to distribute the work over a connection with high latency, since by the time you've done that, performed a relatively cheap CPU operation on it, and sent it back down the wire, it would have been quicker just to queue it up locally and let the same node deal with it when it had some CPU slots spare.

In other words the higher the latency between nodes, the less efficient the entire system becomes, and the less economical horizontal scaling becomes. In addition, there's a ton of extra overhead to maintain concurrency when latency is high.
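A back-of-the-envelope sketch of that trade-off (all numbers below - bandwidth, latency, job sizes - are made-up assumptions for illustration):

```python
# Rough break-even model: shipping a job to a remote node only wins when
# the transfer time is small relative to the compute time saved.

def remote_worth_it(data_bytes, cpu_seconds, bandwidth_bytes_per_s, latency_s):
    """True if farming the job out beats queuing it locally."""
    # Round trip: send the data over, get the result back.
    transfer_time = 2 * latency_s + 2 * data_bytes / bandwidth_bytes_per_s
    return transfer_time < cpu_seconds

# CPU-heavy job (protein folding style): 1 MB of input, an hour of compute.
print(remote_worth_it(1e6, 3600, 12.5e6, 0.05))   # True: ship it out

# Data-heavy job (index building style): 10 GB of input, 10 s of compute.
print(remote_worth_it(10e9, 10, 12.5e6, 0.05))    # False: keep it local
```

The 12.5e6 figure is 100 Mbps expressed in bytes per second; at that bandwidth the data-heavy job spends ~1600 s on the wire to save 10 s of CPU.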

It could be great for some niche problems though - like CDNs, P2P relays, or local backup for nearby machines.


Actually, compute-intensive jobs are becoming more prevalent over time, and request-response systems such as computer vision, speech recognition, and search (which are quickly gaining ground) are embarrassingly parallel tasks that don't require low-latency dense clusters.


No, most of these do. To build a search engine, for example, you need to collect a massive amount of data (say, every web page on the internet) and perform a relatively trivial amount of computation on it (building a TF/IDF term index, computing PageRank, etc.) to build your indexes.

While building a search index is computationally intensive in aggregate, the number of CPU cycles exercised per GB of data is relatively low. So 'distributing' this problem by farming the data out across a high-latency connection, having it processed on another node, and then having it returned to you would actually be slower in many cases than just processing it locally on a decent machine.
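To make the "cheap computation per GB" point concrete, here is a minimal TF-IDF sketch over a toy corpus (the corpus and all names are invented for illustration; a real indexer would tokenize and scale very differently, but the per-document work is the same single cheap pass):

```python
# Building a TF-IDF index is one pass over each document plus some
# cheap arithmetic per term - CPU cycles per GB of input stay low.
import math
from collections import Counter

docs = {
    "d1": "the cat sat on the mat",
    "d2": "the dog sat on the log",
    "d3": "cats and dogs",
}

# Term frequency per document (the single pass over the data).
tf = {doc_id: Counter(text.split()) for doc_id, text in docs.items()}

# Document frequency, then inverse document frequency per term.
df = Counter(term for counts in tf.values() for term in counts)
idf = {term: math.log(len(docs) / count) for term, count in df.items()}

def tfidf(term, doc_id):
    """Weight of a term in a document: trivial arithmetic per posting."""
    return tf[doc_id][term] * idf.get(term, 0.0)

print(round(tfidf("cat", "d1"), 3))  # 1.099 = 1 * ln(3/1)
```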


It depends on what is meant by a computer. I'd refer to it as a mini-cluster that can fit into a basement for heating, in the context of this article (and soon enough, by Moore's law, that will be desktop-sized). On that scale, all these tasks are embarrassingly parallel. Exceptions to this are multi-user systems where user interactions matter, such as Facebook, in which case a huge cluster is required. My argument is that more computing will be devoted to the former (basically machine learning, and scientific/engineering optimization) than the latter (multi-user databases) over time, simply because humans can only enter so much info (all 7 billion of us) compared to what machines can gather and compute.


Can you contact me? I want to pick your brain.


sure, added my email to profile


Frequent physical component failures mean cloud/furnace engineers would have to knock on people's doors all the time.

"Hi, I'm from the Internet. Your furnace is beeping? Yeah, I've got a spare uhm.. heating element."


This sounds like too complicated a logistical problem to work properly. By the time something like this was ready to roll out, after doing initial experiments and then enough pilot projects to make sure it was sound, there would almost certainly be a better way to recycle all of that heat.

For example, you could convert it to electricity using sun free photovoltaics (http://web.mit.edu/newsoffice/2011/sun-free-photovoltaics-07...), and then store it in solid state batteries (http://en.wikipedia.org/wiki/Solid-state_battery) for distribution.

Cloud computing is about out-sourcing the pain of managing a server farm - Data Furnaces seems like an idea going in the opposite direction.


Sun-free photovoltaics only work with high-quality heat - heat produced at around 800 degrees Celsius, typically by burning a hydrocarbon fuel [1]. They're not capable of gathering waste heat and converting it to electricity.

[1] I'm currently working with the lab from the research article above; if you re-read that article, it explicitly mentions fuel sources a few times.


I remember butane being mentioned in the article. I was more or less spitballing an idea of how future technology could make more sense. I don't think sun-free PVs or solid-state batteries are where they need to be yet to make this work, but you know more about it than me; maybe it will never be feasible.

Speaking of which, are you familiar with this work?:

http://www.geek.com/articles/geek-cetera/8-grams-of-thorium-... - it was another Hacker News submission, so you may have seen it.

Have you approached him about using sun-free PVs instead of mini steam turbines? Or wouldn't the Thorium generate the heat quality you need?


I was wondering if the author would mention District Heating, which arguably makes way more sense for effectively using this waste heat, and he did:

"Many cities in Europe already have insulated pipes in place for centralized “district heating.” Heat generated by data centers is beginning to be distributed to neighboring homes and commercial buildings — in Helsinki, for example."

This is an infrastructure challenge, but many US cities already have district heating systems. New York's is the largest, but several other cities have it already. Wikipedia lists about 20, plus systems installed on university campuses. Surely some of these cities are good places for data centers that could be integrated into the network.

http://en.wikipedia.org/wiki/District_heating#United_States


Academica has a data center in central Helsinki where the excess heat goes to the heating network used for nearby apartment buildings.

In Finnish: http://www.abb.fi/cawp/seitp202/91553a3dd19688b8c12577350036...


Would it not be more practical to use the heat in a district heating system?

[1] http://en.wikipedia.org/wiki/District_heating


The problem with that is the infrastructure you need to have in place. It does have the advantage of solving security, maintenance and latency issues you have with data furnaces.


But you really should have that infrastructure around anyway -- in the city I live in, heating costs are a joke (I pay about 11% of what it cost my grandparents to heat their home, even though it is only about 20% larger -- though it is difficult to compare directly since I live in an apartment) because we reuse excess heat from some local factories (and make them a bunch of monies in the process).


It's an intriguing idea, mainly because it's being done in a limited way already by some. I have only six hardware servers in the basement, running some 20 VMs and 20 TB of backup disk, connected to the Internet through fibre. They only use about 500-600W, mainly idling most of the time, which is under 5kWh/year in electricity.

Total house power use is about 28-30kWh/year. We use a heat pump for heating and hot water, which takes heat from two 100m boreholes. Which probably gives us about 2.5W of heat for each 1W of electricity it uses. We live in a moderately cold climate and the power use is about average for the size and age of house.

The biggest issue may be the local IP infrastructure as we only have 100 Mbps delivered ($50/month non-commercial use, $750/month commercial) and a 2 Gbps local loop for thousands of households. I could get an upgrade to 1Gbps to the door (or basement) but haven't really needed it so don't know what the cost would be, particularly if the load was constantly high. I could imagine the price would go up.

Not sure the economics of it would stack up, but maybe.


Check your units. 500W is 12kWh per _day_, or 4000kWh-ish per year. Do you mean 5MWh?

Total house power usage is most likely off by a huge factor, too.


Yes, of course. I was doing 0.55 kW x 24 h x 365 = 4818 kWh/year, and it is about 30,000 kWh/year for the house. Easy to check, as it is about 1 SEK/kWh or so, and I have an average bill of about 2500 SEK/month. (It's what I get for trying to do maths while supposedly on holiday. ;) It was 5k kWh, i.e. 5 MWh.)
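Spelling out the corrected arithmetic (using only the figures stated above - the 1 SEK/kWh price is the commenter's own rough assumption):

```python
# ~550 W of continuous server load, converted to annual energy use.
servers_kw = 0.55
server_kwh_year = servers_kw * 24 * 365
print(server_kwh_year)                    # 4818.0 kWh/year, i.e. ~4.8 MWh

# Sanity check against the stated electricity bill.
house_kwh_year = 30_000                   # stated household total
sek_per_kwh = 1.0                         # stated rough price
print(house_kwh_year * sek_per_kwh / 12)  # 2500.0 SEK/month
```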


I live in an older (1920s) house in upstate NY. I would certainly love to save money on my heating bill, but there is no way the electrical infrastructure in my house will support a cage full of servers.

How many houses have the necessary electrical service to power a cage full of servers without the installation of a new box/service?


If this plan (by some wild chance) became reality, the data utility could add a new electrical box. That's one of the easiest issues to solve.


How many amps do we need to have on the home's supply? My home is a newer home and has 200 amp service. Lots of older homes around the US have 100 amp service or even smaller.

Is this furnace going to clamp onto the existing supply or do we need to get additional capacity into each home? If the system takes the place of the heat pump in the winter that's okay but what about the summer? You mean I'm going to have a furnace and an A/C unit fighting for electricity in my home?

Adding capacity isn't always as simple as calling up the electrician and adding a phase to the breaker box. Now we're stringing cables, digging up backyards, and running lines to transformers in neighborhoods (like mine) where the infrastructure explodes if a squirrel farts in the wrong direction (I'm looking at you, ComEd).


Taking heat to the people is good, but what about bringing people to the heat? Start selling apartments in the data centers.


Agreed. IMO, this is the only way to do this sort of thing. Build apartment buildings on top of or very close to data centers.


I'd live above a datacenter if I could. Not for the heat, but for the connection speed - imagine multi-gigabit connections per apartment!


Cheap power, free heat, and great internet. How long until hacker communes spring up (no pun intended) around them?


I often wonder why large datacenters are not put in colder places, using the temperature difference between inside and outside to generate electricity (Stirling engines, thermoelectric generators) to help offset the electricity used to generate that heat.

Perhaps the relative inefficiency of the heat engines vs the cost to implement them? Still, you would think retrieving some electricity after generating all that heat would be useful.


Probably because those devices cost more than the electricity they make.

The trouble is that the heat is diffuse - it's not concentrated in one spot, and collecting it from an entire building is not practical.

(Just in case you are wondering, if you have active cooling then the heat is concentrated, but capturing it will raise your cooling cost, and it will raise it more than what you gain by capturing it.)


I think the major problem is that you need a high temperature difference for efficient power generation. Assuming an outside temperature of 0 celsius, and a maximum temperature of 100 celsius, you have a theoretical maximum efficiency of around 25%, and in practice much less than that. (see Carnot cycle)
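The Carnot bound mentioned above is easy to check (temperatures must be in kelvin; the 0 °C / 100 °C figures are the comment's own assumptions):

```python
# Carnot limit: efficiency = 1 - T_cold / T_hot, temperatures in kelvin.
def carnot_efficiency(t_hot_c, t_cold_c):
    t_hot, t_cold = t_hot_c + 273.15, t_cold_c + 273.15
    return 1 - t_cold / t_hot

print(round(carnot_efficiency(100, 0), 3))  # 0.268 - roughly the 25% ceiling
```

Real heat engines achieve well under this theoretical ceiling, which is why low-grade waste heat is so hard to monetize.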


"Facebook is to build a new server farm on the edge of the Arctic Circle – its first outside the United States – to improve performance for European users, officials of the social networking site said Thursday." - http://huff.to/vHOqne


Google's doing it as well. This is an example of an old paper mill (with great water cooling possibilities) that was converted to a data center. http://www.google.com/about/datacenters/locations/hamina/


Two years ago at our student conference we had a guest lecturer from a Dutch company called "Kyoto Cooling"; they install these kinds of things in data centers. The guy said this kind of cooling works when the temperature outside is <22 degrees Celsius.


Frustrating. Hard to evaluate as a business idea without knowing how valuable these things are - how much energy/heat they would actually provide.

Also, I don't see a reason why it couldn't do realtime stuff as well...

And if this was at Microsoft Research, why did Microsoft not do something more strategic with this? Couldn't they start running Azure nodes in people's houses? Maybe it's because the economics don't work out...


Microsoft Research is not an R&D department; most of what they do probably isn't used by Microsoft directly.


I read this was to be done at Telehouse West (http://www.datacenterknowledge.com/archives/2009/04/15/teleh...) last year but haven't seen any detailed follow-up.


The article doesn't discuss it, but the CDN potential here is huge. This would be great for a company like Akamai, without too much confidential information on its machines.

You could also rig the machines to auto-delete if the cage is opened. Though most people don't go poking around their generators or furnaces.


A friend used to work in security at Akamai, and in fact he would bring home stories about the secure enclosures he was designing for insecurely colocated machines. They were supposed to emphatically destroy the data on the box if it was opened.

This would have been in the early 2000s.. no idea what, if anything, made it into production.


Plus, if the software update or whatnot you wanted to download was on the CDN node in your basement, you might be able to download at LAN speeds!


They can address power and network redundancy in software. What they can't address is security: one of the primary reasons data centers exist.


They can and already have solved the security issues by full hard disk encryption of the servers. Any attempt to open the server rack or tamper with the hardware triggers a dead man switch that powers off the servers. After losing power, the systems must have a passphrase typed in over a remote secure console before they can boot. Many Fortune 500 companies already do this with all of their laptops. By using full disk encryption, if a laptop or server goes missing, even if it had a hard drive full of SSNs, they don't have to report the data loss because the data was encrypted.


The kind of nerds who would jump at the chance to install this sort of stuff (those people TFA referred to as already heating their homes with servers) are the exact same people who would jump at the chance to drill out the back plates of the cabinet to defeat the tamper switch(es), then freon-freeze your RAM to steal your keys... just because it's there, tempting them.

http://xkcd.com/916/

These issues have _not_ been solved. Tamper-resistant and tamper-proof are entirely different concepts.


Right, you can still get around this with the good old "use freon to freeze the DIMMs, then search RAM for the encryption key" trick. Realistically, however, you need to look at the value of the data as opposed to the cost to retrieve it. For example, if all that is stored on the servers is a bunch of MPAA movies for streaming content (Akamai CDN or similar), the value of the data to an attacker is not greater than the cost of retrieving it. After all, any content on the Akamai CDN can most likely be found on some other channel, like BitTorrent, at a much lower cost than drilling out a hardware case and using freon on the DIMMs. The sophistication needed for that type of attack is only viable if the target data is worth $millions, and I suspect that nobody will be using home furnace servers for that type of processing.

The key is to make the cost of accessing the data greater than the value of the data. If the value of the data is < cost to retrieve it you have a viable business model.


A lot of things don't need that much security: numerical compute jobs, or static file serving.


The integrity of any "static file" which is a HTML or JS component of a web application is necessary to the security of that entire application. (Unfortunately.)


Lower-hanging fruit along the same lines is figuring out a way to harness energy created by exercise equipment (treadmills, ellipticals, etc). I've often thought how dumb it is that all that energy is just burned off as friction. Seems like this would be a lot less complex to implement.


However, a quick back-of-the-envelope calculation rapidly shows that muscle energy is dwarfed by possible savings on ordinary energy uses. A human being on a treadmill can't generate much more than 100 W of usable power, while burning several hundred kilocalories per hour of food energy to do it.
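The envelope in question (the $0.12/kWh retail price is an assumed round number):

```python
# Even a hard hour on a generator-equipped treadmill yields a
# trivial amount of electricity at retail prices.
watts_output = 100            # optimistic sustained human output
hours = 1
kwh = watts_output * hours / 1000
price_per_kwh = 0.12          # assumed US retail $/kWh
print(kwh * price_per_kwh)    # ~$0.012 of electricity per workout
```

So a gym full of machines, each used a few hours a day, recovers pennies per machine per day - before paying for the generator hardware.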


I don't think gp is suggesting using people to generate electricity through exercise. Finding a way to harness energy released when lots of people are exercising anyway (i.e. at a gym) might generate an appreciable amount of energy, with no change in human behavior.


An ancient way of doing that was to have living quarters near or on top of cattle stables. That way, the body heat of your cows would heat your living quarters in winter.

A quick Google shows that there are modern ways of doing this: http://www.telegraph.co.uk/news/worldnews/europe/austria/144...

There also have been proposals to use the excess heat and manure of pigs to grow mushrooms.


The title alone makes me want to see an image taking on data centers as the new Dark Satanic Mills.


A+ for creative thought!

But I think this will never be done by big business, because it sounds too much like a legal liability and security risk.

Still, it's a really cool idea and it might work for less sensitive servers.


This would also solve the fiber-to-the-home problem (it costs ~$10,000 per house).



