EMC Celerra NS20

2009-06-18 Initial Post

My experience working with EMC has not been very good. It’s like dealing with a big bureaucracy. To deploy our NS20s, several different EMC employees were involved: an architect, a project manager, a service delivery coordinator, and some others. For example, the architect would design the system, then another guy would come out just to plug everything in and get it on the network, and then yet another guy would come out to carve up the disks. I understand that the system has a lot of components and is complex, but this is just absurd. No wonder the darn equipment costs so much—EMC has to pay the salaries of all these people.

I don’t know how NetApp or the other competitors operate, but I can’t imagine any organization being less efficient than EMC. For what we use, we didn’t even need something like the Celerra. It’s too complex and costs too much for our needs. I would suggest that an SMB look at simpler alternatives. The life cycle of these storage products is only a few years, so it’s not like you’d get stuck with a unit for 30 years. And at this point, most vendors have the same features, so storage is becoming a commodity product.

Below are my notes from the course EMC NAS Operations and Management with Celerra (version 5.6, on-site, instructor-led), which ran from 12-01-2008 to 12-05-2008.

The student guide needs some serious proofreading and the lab guide needs some serious technical QA’ing. We spent several EXTRA hours on the labs because steps were missing or out of order; it was very frustrating. On top of that, the test environment was very slow and we kept getting disconnected from Celerra Manager, whether we used IE or Firefox.

The computers in the lab were still on Windows 2000! Between the bad lab guide, the slow test environment and other issues, we literally wasted at least two hours during the five-day course.

The Celerra is actually very complex, although it’s marketed as being easy to manage. It seems contradictory, but the complexity is abstracted so that the day-to-day sys admin doesn’t really need to know all the details. The complexity arises from the fact that the Celerra can do a lot. Not only does it perform NAS functions, but it can also be used for block storage with iSCSI or FC. And the Celerra actually consists of several components: Control Station, Data Mover, and Clariion backend storage.

Anyway, after taking this course, I’m even less impressed with EMC. The instructor was very articulate, but he had no actual real-world experience. How the heck do you teach a class on this type of topic with no real-world experience? We could have really used an instructor who could have given us real-world tips.

This is the second in-class, instructor-led course that I’ve taken for work, and I honestly don’t know if I want to take another one. The first class I took, for Packeteer iShared, was also taught by an instructor with no real world experience. I did not find the course materials in either course particularly well written or easy to use.

Anyway, here’s my take on the Celerra:

1.) There are several components to the device, and this course did not cover them all in any detail since it was geared towards administration, not initial setup. Normally you would have EMC physically install everything for you and perform the initial setup, so you don’t really need to know how the components are interconnected. I had gone through some self-paced courses on the hardware and architecture of the Celerra before this class, which helped me understand this area better.

There are too many separate components (software and hardware) in the Celerra, and Celerra Manager is too cluttered. They should remove some of the icons to streamline it. It gets confusing remembering where to change a setting such as NTP. Is it under the Network folder? Nope, NTP is configured under the Data Mover folder.

Compared to NetApp FAS2020/50, the Celerra seems more complicated to use. Some will argue that the complication results in better performance, which could be true. But for a small company with no dedicated storage admin, I think that any real performance gain is a small tradeoff considering how much better integrated and easier to administer the NetApp is.

2.) I found it difficult to understand how the Celerra accesses the backend disk drives. That section was not very clearly written and I had to read it a few times. The whole point of the Celerra is to abstract a lot of the workings of the back end—even for administrators. Basically the key points I got were:
a. The backend—whether Clariion or Symmetrix—presents its drives as LUNs to the Celerra. The Celerra then makes volumes, slices, etc., out of those LUNs. When you think about this, you might wonder what an iSCSI LUN presented by the Celerra actually is. From what I figure, a Celerra iSCSI LUN is not a true LUN because it’s built on top of a backend LUN, so there must be a slight performance hit compared to accessing the LUN directly from the backend.

(We had some guys from EMC come out on 03-13-2009 to set up our NS20, and I mentioned my conclusion about the possible performance hit when using a LUN that is layered on top of another LUN. One of the EMC guys said that performance is actually better because more spindles are used. If you look at it that way, it makes sense. My original thinking was that the extra layer of processing would slow things down, but I guess one layer wouldn’t be an issue. I’m sure that eventually the law of diminishing returns would set in if the layers of LUNs got too deep, though.)

b. The Celerra comes pre-configured with one default storage pool named clar_r5_performance. That pool uses Fibre Channel disks in a 4+1 RAID 5 configuration (four data drives plus one parity drive per RAID group). So when it comes down to it, the Celerra uses RAID 5. Note that this is just the default configuration, so it can be customized, but most customers will probably leave it alone. If a customer really wanted to customize the drive configuration, he’d probably be better off getting a dedicated backend array.

c. After looking at points a and b, the logical conclusion is that iSCSI and FC performance for Celerra is probably not as good as connecting directly to a backend array because of the layers of abstraction. I don’t think there are any performance issues with CIFS or NFS though, since those are not native services provided by the backend arrays and hence there would always be a middle layer between the array and CIFS/NFS clients.

Per the course Student Guide, this is how disks are presented to the Celerra:

<---------- Backend Storage ----->    <----------------------------------- Celerra ----------------------------------->
Drives/Spindles --> LUNs/Hypers --> Disk Volumes --> Slices --> Stripe Volumes --> Metavolumes --> File System
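
To make the layering in point a (and the diagram above) more concrete, here is a toy sketch in Python. It is purely my own illustration, with made-up names and sizes; none of it is EMC code, but it shows why a Celerra iSCSI LUN ends up being a file on a file system rather than a raw backend LUN:

```python
# Illustration only: a toy model of the Celerra storage stack, bottom to top.
# Names and sizes are invented; the real objects live on the Clariion backend
# and the Celerra, not in Python.

backend_lun = {"name": "LUN 0", "raid": "RAID 5 (4+1)", "size_gb": 1072}

# The Celerra sees each backend LUN as a "disk volume".
disk_volume = {"name": "d7", "built_on": backend_lun}

# Slices carve a disk volume into smaller pieces; stripes spread I/O across
# volumes; a metavolume concatenates them into one logical volume.
slice_vol  = {"name": "s1",   "built_on": disk_volume, "size_gb": 200}
stripe_vol = {"name": "stv1", "built_on": [slice_vol], "stripe_kb": 256}
metavolume = {"name": "mtv1", "built_on": [stripe_vol]}

# The file system sits on the metavolume; a "Celerra iSCSI LUN" is then just
# a file inside this file system, which is why it is not a true backend LUN.
file_system       = {"name": "fs1",  "built_on": metavolume}
celerra_iscsi_lun = {"name": "lun0", "backed_by_file_in": file_system}

for layer in (backend_lun, disk_volume, slice_vol, stripe_vol, metavolume,
              file_system, celerra_iscsi_lun):
    print(layer["name"])
```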

3.) Regarding Virtual Data Movers (VDMs), this feature lets you make your CIFS server configuration portable so that you can move it to another physical Data Mover (DM) after a disaster. But the VDM only contains the CIFS configuration, not the CIFS share data (the file system). So you need to coordinate the replication (via Celerra Replicator) of the CIFS file system along with the VDM. The VDM idea sounds great, but really, it’s just a very small part of a DR plan.

4.) Regarding creation of CIFS servers, the course materials didn’t mention this, but there is a CIFS management MMC on the Celerra CD that makes creating and managing CIFS servers a lot more streamlined. Some classmates had mentioned that, which is how I found out about it. I have no idea why that was left out of the course materials. Creating a CIFS server and share manually requires several steps; the steps are not complicated, but the whole process itself seems more complicated than necessary.

5.) Regarding CIFS server and share management, you can use the Windows Computer Management MMC to manage shares and share/folder permissions. CIFS servers also support a limited number of GPO settings, which is pretty cool.

CIFS requires the Usermapper service (or some other type of user mapping) to be running because the Celerra is a Linux-based system and uses UIDs and GIDs (user IDs and group IDs) instead of Windows SIDs. Usermapper runs on the Data Mover and maps between Windows SIDs and UIDs/GIDs.
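
As a rough mental model of what the mapping has to guarantee (my own sketch, not how Usermapper is actually implemented), it behaves something like this:

```python
# Toy illustration of Windows SID -> Unix UID mapping, in the spirit of
# Usermapper. The real service runs on the Data Mover and persists its map;
# this just shows why every CIFS user needs a stable UID.

class ToyUsermapper:
    def __init__(self, first_uid=32768):
        self._next_uid = first_uid
        self._sid_to_uid = {}          # persistent on the real system

    def uid_for_sid(self, sid):
        # A SID seen for the first time gets the next free UID; after that,
        # the same SID must always map to the same UID, or file ownership on
        # the Celerra file systems would appear to change.
        if sid not in self._sid_to_uid:
            self._sid_to_uid[sid] = self._next_uid
            self._next_uid += 1
        return self._sid_to_uid[sid]

mapper = ToyUsermapper()
print(mapper.uid_for_sid("S-1-5-21-1111-2222-3333-1001"))  # e.g. 32768
print(mapper.uid_for_sid("S-1-5-21-1111-2222-3333-1001"))  # same UID again
```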

6.) HomeDir (automated home directory feature) was not well explained in the course and I still don’t see why anyone would use it. I honestly do not see a huge benefit of HomeDir over using native Windows tools. Perhaps if the course material explained it better I might see more benefits, but with what was presented, it seems more complicated than it’s worth.

One thing that was not clearly mentioned was that all users get the same UNC, \\your-dm\HOME, as their home drive. That’s correct—all users would have the same home directory UNC. On the backend, Celerra knows which user is authenticating and will point \\your-dm\HOME to the user’s actual CIFS folder. That’s why user mapping must be working properly.
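
To picture what the Data Mover is doing behind \\your-dm\HOME, here is a rough sketch. This is just my guess at the behavior, with an invented directory layout (the HOMEDIR_ROOT path and the domain/username scheme are hypothetical), not how the HomeDir feature is actually implemented:

```python
# Illustration only: how one share name (HOME) can resolve to a different
# folder per user. The layout below is made up; on the Celerra the mapping
# is driven by the HomeDir configuration.

HOMEDIR_ROOT = "/fs_home"   # hypothetical file system holding home folders

def resolve_home(share, authenticated_user):
    # Every user asks for the same UNC, \\your-dm\HOME, but the Data Mover
    # already knows who authenticated, so it can hand back a per-user path.
    if share.upper() != "HOME":
        raise ValueError("not the HOME share")
    domain, _, username = authenticated_user.partition("\\")
    return f"{HOMEDIR_ROOT}/{domain.lower()}/{username.lower()}"

print(resolve_home("HOME", "CORP\\alice"))  # /fs_home/corp/alice
print(resolve_home("HOME", "CORP\\bob"))    # /fs_home/corp/bob
```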

So if every user has the same share name, then the storage admin must browse several layers under \\your-dm\c$ to get to the actual data folder for the users. I just don’t see how this is beneficial.

7.) CAVA (Celerra AntiVirus Agent) doesn’t seem like a very efficient setup. Basically what happens is that you need a dedicated Windows server, which has anti-virus software, to scan CIFS server files on the Celerra. The scan takes place across the network; the Celerra gives the CAVA server the UNC and it connects to the file and scans it over the network. I’m not sure how competitors such as NetApp handle something like this, but I can’t imagine they’d be any less efficient.
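
As I understand the flow (this is a simplified illustration, and all the function and server names below are made up), it goes roughly like this:

```python
# Toy illustration of the CAVA flow: the Data Mover does not scan anything
# itself; it hands a UNC to a Windows server running the AV engine, which
# pulls the file back across the network to scan it.

def ask_cava_server(server, unc_path):
    # In real life this is a network round trip, and the AV engine then opens
    # the file over CIFS to scan it -- so the file crosses the wire again.
    print(f"{server}: scanning {unc_path}")
    return "clean"

def on_cifs_file_closed(unc_path, cava_servers):
    # 1. A client finishes writing a file on a CIFS share.
    # 2. The Data Mover picks a registered CAVA server and sends it the UNC.
    verdict = ask_cava_server(cava_servers[0], unc_path)
    # 3. Access to the file is allowed or blocked based on the verdict.
    return verdict == "clean"

print(on_cifs_file_closed(r"\\your-dm\share\report.doc", ["cava01"]))
```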

8.) File extension filtering (FEF) works at the DM or CIFS share level and allows you to put certain restrictions on file types, such as whether they can be saved to the share, who can access them, etc. As far as controlling who can access certain files, you’d want to do that through file permissions, not FEF. FEF only filters files by their extension, not their content, so a crafty user could rename all .mp3 files to .abc to bypass filtering of .mp3 files.

The way FEF works is a bit creative; you’d use Notepad to create an empty file in the C$\.filefilter\ folder on the DM. You’d name the file itself with the extension to filter along with some other optional parameters, e.g., ppt@some-share to filter .ppt files on the some-share share. Then you go to the ACL of that file and give specific users/groups permission to allow or deny them certain actions with that file extension type on the share.
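
Roughly speaking, the filtering decision keys off those marker files. Here is my own simplified sketch of the idea (the paths and the "presence means blocked" rule are my simplification; on the real system the marker file's ACL determines who is allowed or denied):

```python
# Toy illustration of how File Extension Filtering keys off marker files in
# C$\.filefilter. The paths and rules here are invented for the example.

import os

FILTER_DIR = r"C$/.filefilter"   # as seen from the Data Mover root

def is_blocked(filename, share, filter_dir=FILTER_DIR):
    ext = filename.rsplit(".", 1)[-1].lower()
    # A marker file named "ppt@some-share" filters .ppt on that share only;
    # a marker named just "ppt" would apply more broadly.
    candidates = [f"{ext}@{share}", ext]
    for marker in candidates:
        if os.path.exists(os.path.join(filter_dir, marker)):
            # In reality the ACL on the marker file decides which users are
            # allowed or denied; here presence alone counts as "blocked".
            return True
    return False

print(is_blocked("slides.ppt", "some-share"))   # True only if the marker exists
print(is_blocked("slides.abc", "some-share"))   # False: FEF only sees extensions
```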

9.) The Celerra seems to have good support for network fault tolerance. It supports aggregate and virtual network ports (devices) using combinations of EtherChannel (Cisco proprietary), LACP (Link Aggregation Control Protocol, IEEE 802.3ad) and FSN (Fail Safe Network). FSN is specific to EMC and can use combinations of physical and virtual ports, and can also use ports on different switches without requiring the switches to support FSN.

10.) The default setting for iSCSI is not to use any authentication for the initiator; you have to explicitly enable CHAP authentication. Also, I’m not sure if the iSCSI protocol in general is very secure; nothing about encryption of iSCSI packets was mentioned.

11.) Delegation of administration is similar to MS SQL Server—you map AD groups to local administrative roles on the Control Station (CS). This was pretty easy to set up in the lab. One thing with roles is that they’re not enforced in the CLI—I’m not sure if that means that a user who is assigned to a role can’t log on to the CS via telnet, or if the user can’t run CLI commands within the Celerra Manager session.

12.) The auditing feature seems to be pretty flexible and integrates well with the Windows auditing model.

13.) The SnapSure feature seems pretty good and it integrates with the Previous Versions tab in Windows. I did find it confusing that EMC wasn’t consistent with the naming of the different components. The Checkpoint is the actual point-in-time snapshot which gets saved to the SavVol (the volume for storing Checkpoints). So why didn’t they just name these components “Snapshot” and “SnapshotVol” or something similar?

These are the main components that work with SnapSure:

a. The production file system (PFS).
b. A bitmap of changed blocks on the PFS. This bitmap gets reused with each new snapshot/checkpoint.
c. A blockmap, which keeps track of where in the SavVol the original contents of each changed block were saved. Each snapshot/checkpoint has an associated blockmap.
d. The SavVol, the volume that holds those saved, pre-change copies of the blocks.
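
Putting a through d together, the mechanism looks like copy-on-first-write: before a PFS block is overwritten for the first time, the original block is copied to the SavVol and the checkpoint's blockmap records where it went. Here is a toy sketch of that idea (my own reconstruction from the course material, not EMC's implementation):

```python
# Toy copy-on-first-write model: before a PFS block is overwritten for the
# first time since a checkpoint, the old block is copied to the SavVol and
# the checkpoint's blockmap records where it went.

pfs      = {0: "A", 1: "B", 2: "C"}   # production file system blocks
savvol   = []                          # save volume
bitmap   = set()                       # blocks already saved since the checkpoint
blockmap = {}                          # this checkpoint's block -> SavVol index

def write_block(block, new_data):
    if block not in bitmap:                 # first write since the checkpoint?
        blockmap[block] = len(savvol)       # remember where the old data goes
        savvol.append(pfs[block])           # save the original block
        bitmap.add(block)
    pfs[block] = new_data                   # then let the write proceed

def read_from_checkpoint(block):
    # A checkpoint read uses the SavVol copy if the block changed,
    # otherwise it just reads the unchanged block from the PFS.
    return savvol[blockmap[block]] if block in blockmap else pfs[block]

write_block(1, "B'")
print(pfs[1], read_from_checkpoint(1))   # B' B  -- live vs. point-in-time view
```

Only the first write to a given block since the checkpoint pays the copy cost; later writes to the same block go straight to the PFS.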

New in v5.6 is the Writeable Snaps feature, which allows snapshots to be writeable so that you can use them for testing and development. There are several limitations with Writeable Snaps, so read up on them before using the feature.

14.) Celerra Replicator V2 (CR) is based on SnapSure technology and is asynchronous (not near real-time). You can replicate different types of objects, e.g., VDMs, file systems, and iSCSI LUNs. When configuring an interconnect (communication path) for remote replication, it defaults to using 100% of the network capacity at all times. This could obviously choke the network and cause issues for other applications. You can configure schedules for bandwidth throttling, so be sure to do that.

CR replicates between DMs. The destination object can only be accessed as read-only. You can replicate using the following types:
1. Loopback: within the same Data Mover. Other than for testing, I don’t see this making a lot of sense for disaster recovery since the replication copy is not offsite.
2. Locally: between DMs within the same Celerra cabinet. Again, I don’t see this making a lot of sense other than for testing.
3. Remote: between DMs in different Celerras. Note that the different Celerras don’t need to be in different locations; they could be in the same data center. This makes the most sense for disaster recovery if the Celerras are in different physical locations.

CR uses two checkpoints for each replication session. Checkpoint 1 is used for the baseline and checkpoint 2 is used for the deltas.
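
I take the two checkpoints to work something like this (a simplified sketch of the idea, not EMC's actual transfer logic):

```python
# Toy illustration of why Replicator keeps two checkpoints per session:
# ckpt1 is the state the destination already has, ckpt2 is a newer
# point-in-time, and only the blocks that differ need to cross the wire.

ckpt1 = {0: "A", 1: "B", 2: "C"}          # baseline, already on the destination
ckpt2 = {0: "A", 1: "B'", 2: "C"}         # newer checkpoint on the source

delta = {blk: data for blk, data in ckpt2.items() if ckpt1.get(blk) != data}
print(delta)                               # {1: "B'"} -- only the change is sent

# After the transfer, ckpt2 becomes the new baseline, so the next cycle
# again only ships what changed in between.
```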

One thing that was just glossed over was how to properly set up replication of VDMs and their CIFS file systems. I didn’t see anything in CR that lets you point to a VDM and have it automatically configure replication for the VDM and all of its CIFS file systems. Maybe you can do that in the Replication Wizard, but there was only one slide about that, which just stated that wizards were available for CR.

Note that even though CR can replicate data on volumes used by applications, it doesn’t mean that the replicated copy will be consistent. You need to use EMC Replication Manager (an add-on) to ensure that application data such as SQL and Exchange databases are in a consistent state before replication. This wasn’t mentioned clearly in the course, but I learned about it when I was evaluating the Celerra features before the course.
