Tag Archive



First Steps to Digital Detox – Room for Debate Blog – NYTimes.com

Russell A. Poldrack is the director of the Imaging Research Center and professor of psychology and neurobiology at the University of Texas at Austin.

As a busy researcher who owns an iPhone, iPad, and several computers, I often find it very difficult to practice what I preach when it comes to the dangers of multitasking (though I absolutely never talk on the cellphone while driving).


I think that the first key to successfully unplugging is to gain some insight into the effects that multitasking and information overload have on our own minds. As nicely discussed in the book “The Invisible Gorilla” by Chris Chabris and Dan Simons, humans are often very poor at understanding how our own minds work, and multitasking is a perfect example: Everyone thinks that they are one of those 3 percent of “supertaskers,” even as the scientific data shows that multitasking takes a serious toll on our performance as well as on our emotional lives.

Our research has shown that multitasking can have an insidious effect on learning, changing the brain systems that are involved so that even if one can learn while multitasking, the nature of that learning is altered to be less flexible. This effect is of particular concern given the increasing use of devices by children during studying.

via First Steps to Digital Detox – Room for Debate Blog – NYTimes.com.

RYW Read-Your-Writes consistency explained | DBMS2 — DataBase Management System Services

The core ideas of RYW consistency, as implemented in various NoSQL systems, are:

Let N = the number of copies of each record distributed across nodes of a parallel system.

Let W = the number of nodes that must successfully acknowledge a write for it to be successfully committed. By definition, W <= N.

Let R = the number of nodes that must send back the same value of a unit of data for it to be accepted as read by the system. By definition, R <= N.

The greater N-R and N-W are, the more node or network failures you can typically tolerate without blocking work.

As long as R + W > N, you are assured of RYW consistency.

Example: Let N = 3, W = 2, and R = 2. Suppose you write a record successfully to at least two nodes out of three. Further suppose that you then poll all three of the nodes. Then the only way you can get two values that agree with each other is if at least one of them — and hence both — return the value that was correctly and successfully written to at least two nodes in the first place.
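The definitions above are easy to check by brute force. What follows is only a minimal sketch, not code from any particular NoSQL system; the function names (`is_ryw_consistent`, `failure_slack`, `quorums_always_overlap`) are made up for illustration. It tests the R + W > N condition, reports N-R and N-W, and confirms by enumeration that every possible read set shares at least one node with every possible write set exactly when the condition holds.

```python
from itertools import combinations


def is_ryw_consistent(n: int, r: int, w: int) -> bool:
    """Return True when the quorum condition R + W > N holds."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("need 1 <= R <= N and 1 <= W <= N")
    return r + w > n


def failure_slack(n: int, r: int, w: int) -> tuple:
    """Return (N - R, N - W): how many nodes may be unreachable
    before reads, respectively writes, can no longer complete."""
    return n - r, n - w


def quorums_always_overlap(n: int, r: int, w: int) -> bool:
    """Enumerate every W-node write set and R-node read set and check
    that each pair has at least one node in common."""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


if __name__ == "__main__":
    for n, r, w in [(3, 2, 2), (3, 1, 2), (3, 3, 3)]:
        print(
            (n, r, w),
            "RYW:", is_ryw_consistent(n, r, w),
            "overlap:", quorums_always_overlap(n, r, w),
            "slack (reads, writes):", failure_slack(n, r, w),
        )
```

With N = 3 and R = W = 2, any two nodes you read from must include at least one of the two nodes that acknowledged the write, which is the overlap the example relies on.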

In a conventional parallel DBMS, N = R = W, which is to say N-R = N-W = 0. Thus, a single hardware failure causes data operations to fail too. For some applications — e.g., highly parallel OLTP web apps — that kind of fragility is deemed unacceptable.

On the other hand, if W < N, it is possible to construct edge cases in which two or more consecutive failures cause incorrect data values to actually be returned. So you want to clean up any discrepancies quickly and bring the system back to a consistent state. That is where the idea of eventual consistency comes in, although you definitely can — and in some famous NoSQL implementations actually do — have eventual consistency in a system that is not RYW consistent.

Much technology goes into eventual consistency, as well as into the data distribution and polling in the first place. And in tunable systems, the choices of N, R, and W — perhaps on a “table” by “table” basis — can get pretty interesting. I’m ducking all those subjects for now, however, not least because of how much I still have to learn about them.

One point I will note, however, is this — RYW consistency and table joins make for awkward companions. If you want to join two tables, each of them distributed across some kind of parallel cluster, there are only two possibilities:

  • In most cases, the data you need to join is co-located on the same nodes.
  • You’re going to have an awful lot of network traffic.

In an R = W = N scenario, co-location may be realistic. But when R < N and W < N, a join can return incorrect results even when both of the tables being joined would have been read correctly.

In our example above, we had N = 3 and R = W = 2. Single-table RYW consistency was ensured. But suppose you join two records, each of which had been written correctly to 2 out of 3 nodes — but with only 1 node being correct about both records. Then only that 1 node out of 3 will return a correct value for the join, and badness will ensue.
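A toy model makes that scenario concrete. This is a sketch under stated assumptions, not a description of how any real system executes joins: the replica contents, the `quorum_read` helper, and the idea of each node joining only its own local copies are all hypothetical, chosen to match the N = 3, R = W = 2 example.

```python
from collections import Counter

N, R = 3, 2  # values from the example above

# Hypothetical replica contents. "new" is the value that was successfully
# written to 2 of 3 nodes; "old" is a stale copy on the node that missed it.
nodes = [
    {"a": "new", "b": "old"},  # node 0 missed the write to record b
    {"a": "new", "b": "new"},  # node 1 is correct about both records
    {"a": "old", "b": "new"},  # node 2 missed the write to record a
]


def quorum_read(values, r):
    """Return the value reported by at least r nodes, or None if none is."""
    value, count = Counter(values).most_common(1)[0]
    return value if count >= r else None


def local_join(node):
    """Join the two records using only this node's local copies."""
    return (node["a"], node["b"])


# Reading each record on its own reaches a quorum on the correct value.
read_a = quorum_read([node["a"] for node in nodes], R)
read_b = quorum_read([node["b"] for node in nodes], R)

# Joining locally on each node and then comparing results does not:
# all three nodes produce a different joined row.
joined = quorum_read([local_join(node) for node in nodes], R)

if __name__ == "__main__":
    print("record a:", read_a)
    print("record b:", read_b)
    print("local joins:", [local_join(node) for node in nodes])
    print("join quorum:", joined)
```

Each single-record read finds two nodes that agree, but no two nodes agree on the joined row, so a quorum of R = 2 cannot be reached for the join without first shipping the individual records around.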

Any architecture I can think of to circumvent that problem results in — you guessed it — an awful lot of network traffic.

via RYW Read-Your-Writes consistency explained | DBMS2 — DataBase Management System Services.

FAWN: Fast Array of Wimpy Nodes for Sun Oracle, Google, and Facebook

by mbenedict October 16, 2009 7:12 PM PDT
It’s not just about the raw cost. There’s a finite amount of electricity you can bring to a data center, so at some point the number of queries you can do per kWh becomes very important. The article mentions heat as waste but like electricity, heat itself also becomes a limiting factor in a large data center. There’s only so much cooling capacity available beyond which you get severe diminishing returns.

So a system which promises to be more energy efficient and runs cooler at the same time… that could be a big win.

by symbolset October 17, 2009 8:19 PM PDT
I’ve been a proponent of FAWN for a long time. For ten years the software has provided the redundancy and the scale. FAWN is not the right answer for every problem, but no tool is.

Configuring the right solution for massively parallel problems is a fairly complex geometry. If you approach a large-grain problem from a cents-per-compute-per-second perspective then FAWN is a slam dunk. For fine-grain problems you want to use GPGPU instead. When the problem becomes large enough, custom system boards and esoteric processors enter the solution set.

It’s really only when you don’t know the granularity of the problem, or you need a general solution that solves both ends of the granularity scale and the middle too that Industry Standard architectures are ideal. In these cases a mixed cluster of wimpy nodes combined with GPGPU nodes may be more cost effective.

Oh, and about cooling: The answer to many problems that start “How do you…” is… don’t. As many have shown the correct answer to the cooling problem is not refrigeration, it’s location, location, location. Your servers are rated to 35C (95F) at least, and if the ambient temperature where they are rises above that, you located your servers in the wrong geographic area, which is a different problem. There are lots of places you could put your servers that won’t get that hot in the next decade. Put your servers some place where the ambient temperature never goes out of range, preferably where they have cheap power (I hear Canada is nice). To find the ideal operation for the fans of your datacenter, heat the inlet temperature to 35C. Fire up the equipment and stress test it at maximum capacity. Measure the outlet temperature. Now you have the ideal outlet temperature. Regulate the fan on the exhaust such that the exhaust is consistently that temperature, less a few degrees for safety, and your server components will remain at a consistent temperature (thus preventing swings in temperature which can cause problems). This is not as complicated as you might think. As an added benefit during a “heat wave” stationary inversion the thermodynamics of a hot exhaust plume exiting high above the building plus the related ground-level cool air inlets creates a cooling breeze which diminishes the air conditioning required to cool the humans in the related office spaces when they’re not in the datacenter. Don’t insulate the datacenter part either – that’s swimming upstream. Maintaining a snow load on the roof should not be a design goal. Also, in really intemperate climes filter the exhaust and pass it through the human workspaces (or if you’re really fussy, use a heat exchanger) – the servers are heating air, there’s no sense burning extra energy to heat separate air to keep the humans comfy.

by ckurowic October 18, 2009 11:01 AM PDT
I disagree with your point of recirculating the hot air from the servers to people’s work areas. Some are VERY sensitive to the outgassing that occurs when equipment is new (and even for many months afterward). You have interesting concepts, but I’m afraid you don’t have the engineering background to support it.

by Christopher_Mims October 20, 2009 11:47 AM PDT
Great article – provides a lot of detail that didn’t make it into my own write-up of FAWN for Technology Review. If you’re interested in a slightly different take, though:

http://www.technologyreview.com/computing/22504/?a=f

“We were looking at efficiency at sub-maximum load. We realized the same techniques could serve high loads more efficiently as well,” said David Andersen, the Carnegie Mellon assistant professor of computer science who helped lead the project.

It’s not just academic work. Google, Intel, and NetApp are helping to fund the project, and the researchers are talking to Facebook, too. “We want to understand their challenges,” Andersen said.

Cut the power

These large-scale systems don’t come cheap. Besides the hardware, software, and maintenance costs, there’s power, too – and companies often must pay for energy twice, in effect, because servers’ waste heat means data centers must be cooled down.

by catbutt5 October 16, 2009 11:51 AM PDT
Oh, I needed a good laugh…

“And addressing the brains… Anil Rao is one inventor on a … patent applied for a computer system with numerous independent processor modules that share access to shared resources including storage, networking, and boot-up technology called the BIOS.”

Trying to patent something that’s existed for more than 20 years are you? Good luck with that.

Anil, ever heard of Sun or IBM or companies that sell refrigerator sized (small and large) computers full of little card slots containing memory and processors (even at different frequencies) that share storage, networking and yes, even the BIOS? It’s the same concept.

What’s your act 2? Gonna try to patent the automobile?

by kirkktx October 16, 2009 12:54 PM PDT
“52 queries per joule of energy compared to 346 for a FAWN cluster”

Somewhere I saw that electricity costs exceed hardware costs amortized over the life of the computer. These numbers should certainly attract investors.
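For a sense of scale, the two queries-per-joule figures quoted above can be turned into a ratio and into queries per kWh. This is a back-of-the-envelope sketch only: the two input numbers come from the quote, the variable names are illustrative, and no electricity price is assumed.

```python
JOULES_PER_KWH = 3_600_000  # 1 kWh = 3.6 million joules

# Figures taken from the quoted sentence; names are illustrative.
baseline_queries_per_joule = 52
fawn_queries_per_joule = 346


def queries_per_kwh(queries_per_joule):
    """Convert a queries-per-joule figure to queries per kilowatt-hour."""
    return queries_per_joule * JOULES_PER_KWH


def kwh_for_queries(num_queries, queries_per_joule):
    """Energy in kWh needed to serve num_queries at the given efficiency."""
    return num_queries / queries_per_joule / JOULES_PER_KWH


efficiency_ratio = fawn_queries_per_joule / baseline_queries_per_joule

if __name__ == "__main__":
    print(f"FAWN vs. baseline: {efficiency_ratio:.2f}x queries per joule")
    print(f"baseline: {queries_per_kwh(baseline_queries_per_joule):,} queries/kWh")
    print(f"FAWN:     {queries_per_kwh(fawn_queries_per_joule):,} queries/kWh")
    billion = 1_000_000_000
    print(f"kWh per billion queries, baseline: "
          f"{kwh_for_queries(billion, baseline_queries_per_joule):.2f}")
    print(f"kWh per billion queries, FAWN:     "
          f"{kwh_for_queries(billion, fawn_queries_per_joule):.2f}")
```

At these figures the FAWN cluster serves roughly 6.7 times as many queries for the same energy, which is the kind of gap that matters once the power you can bring into a data center is the limit.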

http://news.cnet.com/8301-30685_3-10376537-264.html?tag=mncol