
meeting minutes, Jan 4th 2006




UMCE Linux meeting
January 4th, 2006
====================
Attendance: Sean Sweda, Miguel Bermudez, Tony Winkler, Liam Hoekenga, Steve Simmons, Willie Northway, Katarina Lukaszewicz, Charles Stuart, and Michael Garrison


Agenda:
- announcements / review / additions?
- openafs 1.4
- hardware
- spontaneous FS
- radmind 1.51
- participation
- radmind tools
- next meeting


- announcements
---------------
How many emails did you receive for the "leap second"?

- openafs 1.4
---------------
There were some issues combining 2 versions of transcripts, so we should make sure that command files are clean.
It's recommended that everyone switch over to using this, particularly if you want large-file support. The webteam has some tweaked settings. The defaults aren't bad, but we could increase them for large files. You'll see an improvement over 1.2.4.


This is mostly resource allocation, and not necessarily cache size.
Sean and Michael are going to work together.

- hardware
---------------
We have the new 64 bit machines (AMD-based) in from Atipa. Web gets 6, gpcc gets 2.


The demo unit is in for the next 1u order. It has on-board SCSI, and on-board RAID card.
The 1u order will be going out by the end of January, if all goes well. We're going to evaluate various SATA RAID cards to decrease the support nightmare of too many different subsystems. Another vendor will also be supplying a demo unit; they're currently having RAID array issues and will ship once those are resolved.


This is for RAIDing internally in the 1u. We're thinking of buying 4u enclosures, and getting cards that will span. Ideally, we could use the same monitoring for all of our SATA RAIDs. General agreement that RAID support is a nightmare.

The Suns were nice for this, due to the homogeneity. Do we look better having a vendor to blame?
We'll always be at the mercy of the vendor, and no 2 vendors will agree. Either we end up with a Frankenstein that we can replicate, or a whole bunch of Frankensteins with some vendor's label on it. If we all have the same Frankenstein, it won't matter, since every piece will be obsolete within 6 months.


RAID cards aren't a commodity issue just yet. However, the price on these cards is low enough to give that perception.

Unfortunately, we're not finding better options than what we've bought. Can we talk directly to the RAID over PCI? We need to find some manufacturer that meets our specifications. If we can find that, we should be putting it directly onto our RFPs. We're always stuck with what the vendor wanted to support instead of what we want to support, on our scale.

We can't really go with a single vendor. We have to go with the lowest priced vendor who meets the specs.

Perhaps we should send out an RFP specifying the card that we want. Sean doesn't care who builds the machine, inserts the card, etc. Just as long as all of our support and monitoring systems are the same. We can do the homework and figure out our preferred RAID card.

(long discussion about hardware compatibility vendor issues, support software and compatibility)

We don't have the time to do this research... Sean is willing to participate in the purchasing process, and to do the homework, if he feels that it will pay off in the future.

- spontaneous FS corruption
---------------
The file system corruption is causing radmind to react in a humorous way. But the core problem is the file system, not radmind. Sean says that he's seeing the problem in ext2 to the point that they're not using ext2 any more. At least part of the problem is the clean flag being set even if an unclean shutdown occurred. MikeG sees the problem only when having to power cycle a host.


Perhaps we should investigate always running fsck on boot for systems with ext2, perhaps even ext3. Sean strongly advocates moving from ext2/3 to reiser. The IMAP folks really like reiser, reporting good performance and crashworthiness. But other folks have seen some performance issues with reiser; apparently mount options (notail) can have an effect on this. Nor does one want to run the AFS cache under reiser.

- radmind 1.51
---------------
Sean has built a 1.51 transcript and encourages folks to shift to it. The 1.51 release is mostly bugfix, plus their update of the rash script. No man page yet, read the source. Will requested a few changes so we only see the diffs; Sean says that's in the 1.51 release already (see the shared folder). Postapply and preapply work, but you only get prompted if they exist (duh!) :-) Miguel is writing kexpand so you can 'blow up' command files and see the details.


- participation
---------------
Kevin will be here mostly Tues, Thu and Fri (60%). He will spend a lot of time in meetings. Let's see if we can move our meeting to Friday so Kevin can attend. Some sentiment to earlier in afternoon - 1:30 seems popular. Need to check on room availability.


The next meeting will take place on Friday, January 27th at 1:30. Kevin's schedule conflicts with this, but he said that he'll try to change that meeting time.

How do we get others to join these meetings, like sites?

- radmind tools
---------------
Progress is being made. The nanny script is nearly ready to be released. The first run-through will need to be manual; subsequent runs will become automatic with opt-in.


- next meeting
---------------
Steve will facilitate
Miguel will take minutes