Rackspace joins Cern openlab to develop federated clouds for LHC research

Posted on March 17, 2014 at 12:40 pm

European physics laboratory Cern is working with Rackspace on a project to link together multiple clouds, spanning its own site, other research centres and public cloud services, in order to give scientists the compute power to analyse data from the Large Hadron Collider (LHC).

Unveiled today, the partnership sees Rackspace join the Cern openlab public/private collaboration scheme as part of a research project into linking together multiple clouds. The move is expected to provide Cern scientists with huge amounts of compute power to research results from the LHC, while Rackspace hopes to gain valuable experience and insight into best practices around cloud interoperability and managing large-scale cloud infrastructure.

Tim Bell, infrastructure manager at Cern, told V3 that the simulations scientists want to run sometimes exceed the capacity available on site, so the project will explore ways of expanding compute power by federating with clouds at other research facilities and with public Openstack clouds.

“We get around 35 petabytes (PB) a year from the LHC when it’s running, and this data is analysed and recorded using the set of machines hosted at Cern, plus an additional datacentre we’re just setting up at Budapest. We’re in the process now of converting what was a set of physical servers into a large-scale Openstack-based cloud,” Bell explained.

“We expect by the time the [LHC] starts up operations again in 2015 to have around 15,000 servers across those two datacentres to handle the data recording. On top of that, the scientists need to be able to simulate collisions, to visualise what the theory would predict and then analyse the results of the data to compare what real life is looking like against the theory.

“Typically, we would use resources that we have here at Cern, but if there is a conference coming up, we need to run additional programs to analyse the latest data and the workload often exceeds the capacity that we have available. That’s why it is interesting to be looking at being able to take advantage of public cloud resources without having to permanently enlarge the datacentre here,” he said.
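Conceptually, the cloud bursting Bell describes comes down to a simple scheduling decision: run work on site while capacity remains, and overflow the rest to a federated public cloud. A minimal sketch of that decision follows; the job names, core counts and `schedule` helper are purely illustrative assumptions, not Cern's actual software.

```python
def schedule(jobs, local_capacity):
    """Split a list of (name, cores) jobs between the on-site cloud
    and a federated public cloud once local capacity is exhausted."""
    local, burst = [], []
    used = 0
    for name, cores in jobs:
        if used + cores <= local_capacity:
            local.append(name)   # fits on the Cern-hosted cloud
            used += cores
        else:
            burst.append(name)   # overflow to a public Openstack cloud
    return local, burst

local, burst = schedule(
    [("simulation-A", 600), ("analysis-B", 300), ("simulation-C", 400)],
    local_capacity=1000,
)
print(local)  # ['simulation-A', 'analysis-B']
print(burst)  # ['simulation-C']
```

In practice the federation work goes far beyond this, covering identity, image formats and networking across clouds, but the capacity-overflow logic is the core idea.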

Nigel Beighton, vice president of technology at Rackspace, told V3 that his firm is part-funding the project, which he expects to pay dividends in the form of new standards for linking clouds together.

“Cern is one of the world’s largest producers of data. They need large banks of compute power to deal with the amount of data they’ve got. We’re going to be looking at the best technological solutions to do that across multiple clouds,” he explained.

“The best outcome at the end of the day will be if two things come out of this: one is the technology to allow people to connect their clouds together, and second, given Cern’s heritage in open standards, there emerges a set of open standards for broader interoperability of clouds,” Beighton said.

Cern is now operating three Openstack clouds, according to Bell. In addition to the one operated by the site’s IT department, there are two large server farms associated with the CMS and ATLAS experiments, normally used to filter the 1PB of data per second that spews out of the detectors down to a reasonable volume that can be recorded.

While the accelerator is being upgraded over the next 18 months, these two server farms, comprising about 1,300 servers in total, are each being converted into an Openstack cloud to provide extra resources.

“There are also other locations around the globe affiliated with Cern and running Openstack, such as Brookhaven National Laboratory and Nectar Labs in Australia, so it gets very interesting to look at how these resources can be connected and do better sharing. It is both a private-to-private and private-to-public federation that we are envisaging,” Bell said.

Cern is also looking into tools such as Hadoop in order to take advantage of techniques that others are using for large-scale data analytics, according to Bell.

Meanwhile, Cern also has some impressive IT infrastructure just to control and operate the LHC, as V3 reported last year.
