Tuesday, August 13, 2013



Apparently, this is turning into “begging for ideas” week.  Please forgive the dreadful manners.

I know juuuuuust enough about IT to be dangerous.  I recently heard an IT idea that strikes me as obviously great, but experience has taught me that ideas that look obviously great at first blush can hide great sins among the details.  So I’m hoping that some folks who have been through this can shed some light.

The idea is “virtualization,” and my bowdlerized understanding of it is as follows.  In traditional on-campus computer labs, every computer has its own CPU and performs its own calculations and processes.  The computers are networked to each other and to the internet, for obvious reasons, but each is capable of doing some pretty serious internal processing.  If you want to run a program on all of the computers in a lab, you have to install it on each computer individually.  Although much of what computers do is online now, we still pay for and maintain all those separate computer brains within each station.

In virtualization, as I understand it, the brains are moved to a centralized location (or locations).  So instead of thirty different machines each running its own version of BasketWeaver, thirty terminals are connected to a single server running a sort of uber-version of BasketWeaver.  

The appeal is twofold.  First, it’s easier to maintain one program on one server than to maintain thirty installations on thirty CPUs.  Second, you can get away with “dumb terminals” on the user end, allowing for cheaper upfront costs, less maintenance/repair, and more consistency.  You don’t have to worry that some machines are running BasketWeaver 2010 and others are running BasketWeaver 2013; you can ensure that whatever version is running is running everywhere.  From a teaching perspective, that’s a real gain.

Obviously, virtualization requires some big honkin’ servers -- I think BHS is the technical term -- and plenty of bandwidth.  But if you have those, you can save the cost and headaches of trying to maintain hundreds of CPUs across campus.

If that understanding is broadly correct -- I’m basically thinking of the distinction between a Chromebook and a laptop, writ large -- then it seems like it should be a no-brainer.

(I’ll grant upfront that there may be some very specialized use cases in which the old model still makes sense: dedicated Macs for Graphic Design, say.  But even if, say, twenty percent of the student-use computers had to remain in the old model, the savings for the college in terms of both money and maintenance would be substantial.)

But true no-brainers are rare.  There has to be a catch.

So this is where I’m hoping that people with battle scars will shed some light.  What’s the catch?  What’s the “you wouldn’t have thought of it, but this detail will kill you” life lesson?

My company has many users who use SAS (the statistical software). We switched a few years back to have it "virtualized" as you described. The drivers were mostly cost related. On net, it was probably the right decision for us, but there are a few drawbacks to be aware of:
1. Single point of failure. If the server goes down, nobody can do anything. If you have individual machines, if one breaks, you can just go to another.
2. Network outages. Similarly, if the network goes down, you can't do anything.
3. Pilots of upgrades. If you want to upgrade anything, you can't do it slowly to make sure there are no compatibility issues. You go all in with the upgrade, and hope it works.
4. Consistency of speed. When I have my own machine, I know how long it takes to do things. In a virtualized system, how long it takes to do things depends on how many other people are doing things at the same time. If someone is doing something that takes a lot of resources, it slows things down for everyone.
5. Ease of use. This is obviously implementation dependent, but how well does it feel like you are working on your own computer, versus logging in to a server? If I plug a USB drive into my computer, how hard is it to access? For accessibility devices, how well do they integrate? I've seen setups for this where everything looks just a little bit "off" when you are accessing virtualized resources.

I don't think that's what virtualisation is; that's your bog-standard mainframe idea. You have some program that is designed to have multiple users at once (which Basketweaving may or may not be designed to do) running on a big box, with all the graphics and smarts on the big box, and your users connect to it.

Virtualisation is when instead of having a bunch of physical boxes you have a big machine pretending to be a bunch of physical boxes. Each of these pretend boxes is called a Virtual Machine (VM).

Each VM still runs its own copy of Basketweaving, which has no idea it is not running on real hardware. To the application, the operating system, and the users, the VM looks just like a physical box.

The advantage of that is that you can "spin up" (create) a brand spanking new VM any time you want.

This new VM has whatever version of Basketweaving you want on it, from the library of possibilities you have access to. So for UnderWater Basketweaving 201 they have BWv3 with the Snorkel database, and for Remedial Basketweaving they have the Pliers attachment and they never see a Snorkel at all.

When not being used, these VMs can be deleted so they are not using resources. Plus any malware that has made its way onto the VM is destroyed too.
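To make the "spin up / throw away" idea concrete, here is a minimal sketch of what it can look like on one common stack (KVM managed through the libvirt Python bindings). The VM name, memory size, and image path are placeholders, and your big box might just as easily run VMware, Hyper-V, or Xen instead:

    # Rough sketch, assuming KVM/QEMU and the libvirt Python bindings.
    # The domain XML is stripped to the bare minimum; a real lab VM would
    # also define networking, graphics, and so on.
    import libvirt

    DOMAIN_XML = """
    <domain type='kvm'>
      <name>basketweaving-lab-01</name>
      <memory unit='MiB'>2048</memory>
      <vcpu>2</vcpu>
      <os><type arch='x86_64'>hvm</type></os>
      <devices>
        <disk type='file' device='disk'>
          <driver name='qemu' type='qcow2'/>
          <source file='/var/lib/libvirt/images/bw-class.qcow2'/>
          <target dev='vda' bus='virtio'/>
        </disk>
      </devices>
    </domain>
    """

    conn = libvirt.open("qemu:///system")  # connect to the local hypervisor
    vm = conn.createXML(DOMAIN_XML, 0)     # "spin up" a transient VM from the class image
    print(vm.name(), "running:", vm.isActive())

    # ... class happens ...

    vm.destroy()                           # power it off; a transient VM simply ceases to exist
    conn.close()

The point is just that creating and destroying a pretend box is an API call, not a trip to the lab with a screwdriver.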

To make this work, you have to have a good design.

You have to have the right software on the big honking box (there are several ways to do it) for your requirements.

Most of the rest of the infrastructure you probably already have in your computer labs, depending on what you currently do. Things like authentication (who can log on where) and data saving (where do the students save their work to and how is that secured).

Plus of course what you do if the honking big box falls over....

This is the sort of thing that works well in The Cloud. You could work with Amazon to have your computer labs hosted there, with your various master images that are spun up for each class and spun down (so not costing anything) the minute the class is ended for the day. They offer a private network system so you can join it to your campus network and not have it exposed to the big bad internet.
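Purely as an illustration of how little machinery the "spin up for the class, spin down afterward" part involves on Amazon's side, here is a hedged sketch using boto3 (AWS's Python SDK); the image ID, instance type, and seat count below are invented placeholders:

    # Rough sketch with boto3; every identifier below is a placeholder.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Start of class: launch 30 lab machines from the master image.
    resp = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",   # hypothetical BasketWeaver master image
        InstanceType="t3.medium",
        MinCount=30,
        MaxCount=30,
    )
    instance_ids = [i["InstanceId"] for i in resp["Instances"]]

    # End of class: terminate them so they stop costing anything.
    ec2.terminate_instances(InstanceIds=instance_ids)

In practice you would wrap this in scheduling, tagging, and access control, but the billing logic really is that blunt: terminated instances stop costing money (storage for the master image aside).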

You will need a good tech dept to manage it, even if hosted on Amazon. You don't need PC techs, but you do need people who understand how the VMs and the networking work, how to create the images and make them work with your infrastructure, and how to secure them.

And how doing this affects your software licences...

(Yeah, this is my day job)
I was going to post a shorter version of what Rider said at greater length and more authoritatively. I was on the search committee for our new CIO this spring and summer, and virtualization was a topic that came up a lot in our interviews. You need a good (fast and reliable) network infrastructure for it to work, but there are significant advantages in a number of areas. It's especially attractive for servers, since users don't generally need physical access to them.
I want to reinforce how Rider ended. Ask about the software licenses for Basketweaving, other specialty software such as Analytical Calculations Kode (ACK), and generic software (word processing, spreadsheets, ...) before looking too far into the other aspects.

I know that our IT can devise a single PC setup that is then delivered over the network onto the other PCs in a common lab. Presumably that facilitates keeping a bunch of computers w/ the same Basketweaver, ACK, etc. I assume your IT people are already doing this (it isn't a new thing).
There's no "this detail will kill you" here necessarily; the benefits and potential cost savings of desktop virtualization are real, but there are always things which mean that they're not as large as you'd like.

The first thing is the thin clients (dumb terminals); generic low-end desktop PCs have an economy of scale to them that the clients don't (this may change). So you don't save as much on them as you'd like -- remember that Chromebooks are being sold essentially at cost for longer-term reasons.

The second is administration; serving things out centrally is indeed easier than managing each lab PC individually, but your techs almost certainly have systems for re-imaging lab PCs and otherwise automating the administration already. Annoyingly, the better your tech staff in this respect, the smaller the administrative savings you'll get.

Some of the other costs have already been mentioned; you will have to have beefed up infrastructure (network, and the racks of servers; maybe involving upgrades to your machine room?). Your tech staff will have to re-learn on the new system. Also, you'll have to make sure that all your key software (and any peripherals used in the lab!) work in this environment, which isn't necessarily a given. Software licensing in this environment may be cheaper, may be more expensive, or may be the same, and will likely vary package to package.

You're going to have to balance an initial cost in infrastructure (networking, possible machine room upgrades) plus the costs of regular upgrades of (dumb clients + servers) with the costs of regular upgrades of the regular PCs. And obviously costs like beefing up networking, if necessary, will have advantages you'll be able to use in other ways.

The biggest cost savings with desktop virtualization come when you know that at any given time only X% of the desktops are actually used; then you can replace N desktops with N cheaper clients _and_ X% of N worth of backing central resources (including licenses). My guess is that this doesn't apply to the labs in an education environment, but I may be wrong.
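To put invented numbers on that shape (these figures are purely illustrative, not real prices or real utilization rates):

    # Toy back-of-the-envelope comparison; every number here is made up.
    n_seats = 300                # lab desktops today
    desktop_cost = 800           # replacement PC, dollars
    thin_client_cost = 300       # dumb terminal, dollars
    peak_utilization = 0.60      # at most 60% of seats in use at once
    central_cost_per_seat = 600  # server capacity + licenses per *concurrent* seat

    old_model = n_seats * desktop_cost
    new_model = (n_seats * thin_client_cost
                 + n_seats * peak_utilization * central_cost_per_seat)

    print(f"old model: ${old_model:,.0f}")  # $240,000
    print(f"new model: ${new_model:,.0f}")  # $198,000

If scheduled classes mean your labs run near 100% occupancy, the utilization term stops shrinking and most of that gap disappears, which is exactly the caveat above.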
Our CC is the same as the above, only different, with two common types of virtualization but not the one you appear to be discussing.

One thing I picked up that might not be on your radar is the question you need to ask: "What are we doing now?" Sort of like looking at your actual expenses before deciding you need a better home budget. That exercise helped our campus improve how it allocates resources. You have to know what it costs now, and where those costs live, to make a rational decision.

Our CC has a blade server that runs many VMs, often with different operating systems. As noted, this makes certain kinds of migration easier. It does not, however, have everything on it.

All of our campus PCs (except some office ones) are locked down with an "image" that resets at logout and that is reimaged each year. Guess what IT is doing this week! This works great when done correctly. They do, however, have to maintain different images (computer lab, science lab, general classroom, special classroom, offices) but that mess will exist forever.

We have fully virtualized several major campus utilities and will be doing the same with another. The Software That Shall Not Be Named lives somewhere off campus in some other state. This works great except when their machine goes down, but at least you know many other colleges and universities are feeling the same pain and that the bigger guys will definitely make sure it gets fixed promptly. What makes this work well is that very little data moves around. We need a bigger outside pipe (the data flows into campus rather than within campus), but we need that anyway.

Network capacity is critical for other things, however, and might be a bigger deal for what you want to do. We noticed some major problems each time we rolled out some innovation that required more bandwidth, like central video service or just people using networked utilities like Basketweaver or even YouTube. This requires some thought. Have you isolated students using wireless to watch old TV shows on their phone during class from everything else? ;-)

I know the least about the sort of central serving you are talking about, except for fading memories of the good old days on a dialup modem to the only computer on campus. In theory, you don't need lots of CPU power on each lab terminal when the students are only using Facebook or writing a paper or accessing Basketweaver or campus e-mail. That changes when they are watching a video, particularly one delivered by Basketweaver, so there are limits on the economies. Can you get the same economy with less disk and memory on real computers? Or can you just require them to own a phone or tablet or whatever that can run a virtual copy of Word?
I've worked at several research stations that do this and it seems to be very popular in federal gov't, so if you have a research station nearby, ask their IT as well.

From the user end-point, the issues are usually some variant of access and speed. In other words, "virtual" software tends to work slower because most people tend to use it at the same time of day (9.00-17.00). I've had a virtual MS Office, and it attracted more hatred amongst workers than the bureaucracy did (an impressive feat in gov't). On the other hand, my current institution has a virtual stats program, and people are generally happy with it (morale is similar at both places, so I assume the propensity to complain is similar as well).

My very strong recommendation is to use central server software licenses ONLY in cases where students have not used the software before on a daily basis (BasketWeaver, StepOne, SPSS, SAS). If people have a sense of how fast a program ought to be running (e.g. MS Office), they're much more likely to hate central server licenses. Of course, you could run MS Office centrally anyway and install OpenOffice, an equally capable office suite, on all the terminals.
My background is not like that of most of the commenters so far; I'm a user, not a systems person. My institution tried moving to virtualization (of the standard software build) for faculty, and it did not work well. We did have server issues, and we did have response-time issues. But the real issue is that the faculty tend to have very idiosyncratic software needs (or wants), and almost no machine had only the standard build. (I'm a case in point; I had, at that time, probably a dozen software packages for which I was the only user on campus. Most of these were specific to my research or were tied to specific courses that only I taught.)

So for faculty machines, "dumb terminals" may not be practicable.

With labs, a similar issue may arise, in that some programs (e.g., language programs) may be needed only in designated labs (where you need staff to assist in the use of the programs). Virtualizing that may be possible, but then you may need to have (effectively) licenses for all the machines, when you really only need them for 30 or 40.

This discussion reminds me of the general controversy that has arisen around the concept of “cloud computing”, which is now beginning to enter the home computer market as well as the business and university environment. In the cloud model, no longer is the computer’s work being done on the user’s machine--it is now being done on a remotely-located server. Not only is the user’s data stored on the remote server, the software needed to process this data is stored and run there as well.

Adobe has announced that future releases of its Photoshop, Dreamweaver, etc. offerings will no longer be available for customers to purchase and install on their own computers, but will now be located on a remote server and will be accessible to customers over the web for a fee. I have heard quotes of $50/month for access to the Adobe suite, although I think that they will be offering a "teaser" rate of $30/month to get you hooked. Why are they doing this? I suspect that the answer to this question is the answer to most questions today—money. I suppose that Adobe was tired of having their software pirated on a large scale. A lot of people are complaining about Adobe's move into the cloud. Just about everything you want to do with an Adobe product will now probably cost you more. I just heard that Microsoft wants to move to a subscription model for its future Windows releases.

Cloud computing has the advantage of not requiring me to buy and maintain my own software, and I can purchase a less-expensive computer, one which is essentially just a “dumb” terminal whose only function is to connect to the Internet. In addition, I don’t have to worry about software upgrades, and don’t have to worry about getting and installing all of those pesky updates from Microsoft or Adobe. I don’t have to worry about my hard drive crashing and losing everything. I don’t have to worry about getting and installing new software every time Microsoft decides to stop supporting the version of Windows that I am currently using.

Concerns in next section, however.
Bottom line: this is not a strategic decision, it's a tactical decision, and depends on your network set-up, what software you're running and how it is licensed, your IT staff's capabilities, the tools IT has to manage networks and update computers remotely, etc.
It's not the kind of decision you should be making: If you have a good IT manager, they're already doing (or thinking about) this where (if) it makes sense. If you have a bad IT manager, you really don't want to push them into doing something they don't understand.

But there are concerns and problems with cloud computing. One drawback of cloud computing is that the users' data are stored on the cloud provider's server, and not on their own computers. As a result, the user does not have as much control over their own data, and there could be unauthorized access to the data. When my sensitive information or financial data is kept on my own computer, its security is in my own hands, and if someone else gets hold of it, it is no one else’s fault but mine. But when I put it up in the cloud, is there a danger that someone with evil intentions may be able to get access to it? I would be trusting the cloud provider to be careful about security and not to be negligent. Whose fault is it if my business plan ends up on my competitor’s computers? But I suppose if there were a major security breach at a cloud provider, the resulting bad publicity could wreck their business model.

There is also a danger that a software glitch or hardware failure at the cloud provider could result in a corruption of the data or even a loss of the data. The end user is at the mercy of the competency of the cloud provider, and is essentially trusting them to be able to handle security and backups in an effective manner. As a business owner, I would be wary of outsourcing key functionality to some outside organization over which I have little control.

Privacy advocates have criticized the cloud model for giving the hosting company a greater amount of control. They can monitor at will any communication between the host company and end user, and can access user data (with or without permission). Is the cloud provider going to be bombarding me with advertisements or are they going to be selling my private information to third parties?

There is also the issue of lock-in—cloud providers often deliberately make it difficult if not impossible for the end user to migrate to a different and competing platform. One reason I have avoided e-books is that I don’t want to be locked into a particular platform, such as the Amazon Kindle. Can I read the e-book on a different platform? What happens in the unlikely event that Amazon.com goes belly-up? Do I lose my library of e-books if this happens? If tools built to manage a cloud environment are not compatible with different kinds of both virtual and physical infrastructure, those tools will only be able to manage data or apps that live in the vendor's particular cloud environment. My company would become locked to a particular cloud provider, and if I became unhappy with my provider, I would find it very difficult to migrate to a different provider.

Who actually owns the data once it lives on a cloud platform? Does the customer still own it, or does the provider now own it? If I put a copy of a photograph, my company’s business plans, a monograph, a video, or a song that I created up on the cloud, do I lose my intellectual property rights to it? Does the service provider now somehow “own” what I stored up there? In addition, there is a worry that you won't be able to get your photos back if you Photoshop them in the cloud. They will be converted into their proprietary format, and you might not be able to view them on your own computer.

I also worry about being “nickel-and-dimed” to death, by having to pay for virtually every mouse click if I relied on cloud providers for all of my software needs. In the good old days, when I create a Word document, it resides on my machine and I can do anything I want with it anytime I want to. But now that document will be sitting up on some server somewhere and I will probably have to pay every time I want to read or edit it. I would be at their mercy if they managed to lose or corrupt the document.

So I am of two minds on the matter of cloud computing. But it’s coming.
