No more free hosting!

Posted in Random on November 12, 2009 by lucywitdiam0nds

Hi all,

Seeing as I have taken quite an interest in this, I have decided to forget about the just-alright free hosting on wordpress.com, and I have moved to a full-on host :)

The new website is:

http://talesofacoldadmin.com/

 

Not too much different, but a LOT more stuff to play with for me :)

I hope you guys continue to read, and enjoy!

It’s very nice, I like very much.

“My Cluster”…Making RAM faster with every simulation (hopefully).

Posted in Random on November 11, 2009 by lucywitdiam0nds

I use both the words “My” and “Cluster” in quite a loose sense, mainly because it’s not mine, and it’s not quite a cluster. It is owned by Brian Davis, one of my professors here at Tech, but administered by me, and funded by grant money from a project he got a few years ago.

Essentially, the grant (and the simulations) exist to find a better way to organize memory. If you think about the way memory works in the conventional sense, you have programs interacting first with the CPU cache, then with the RAM, then with the HD as a ‘last resort’. Got it. Right.

What he’s aiming to do is provide another layer between the RAM and the HD (and, in the future, between the RAM and the CPU cache) with an algorithm to organize it, both physically (reducing copper wire latency) and programmatically (by page size, as well as frequency of use). I’m not too sure on the specifics past that, and that’s only one branch of what he’s actually researching, but it struck me as quite interesting!

I’m hoping at some point to get job management running properly, as well as “actual clustering”. Right now people just ssh into the head, and there’s a trust set up between the head and the nodes, so they don’t have to log in again after they’ve authenticated through the head. The nodes are pretty easy to start and compile code on, so it works for all the simulations, at least for now.

Every node is running Red Hat Enterprise Linux 5, and the simulations are written in C. I’ve got everything set up to kickstart whenever I need it. Take tonight, for example: my nodes have been mysteriously dying, and I have a feeling that someone’s code corrupted something in the tg3 module, and they just kept trying to run it on all the nodes.

Thankfully I have set it up so it’s nice and easy to wipe everything clean. Thank god for kickstart; this is a copy of mine:

http://pastebin.com/f6c745c08

Running through my kickstart, you’ll see that it puts every node on the same NIS domain (btdpool.ee.mtu.edu) and uses an FTP server to grab all the necessary packages and files. No X Windows, as there isn’t a need for it. It generates the swap partition depending on memory size, and fills the rest of the disk with ext3 (forget LVM!). It installs all the packages it really needs.
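The pastebin above has the real thing, but the skeleton looks roughly like this (a sketch, trimmed way down – the FTP host here is made up, so swap in your own):

install
url --url ftp://install.server.example/rhel5
network --bootproto dhcp
lang en_US.UTF-8
keyboard us
skipx
auth --enablenis --nisdomain=btdpool.ee.mtu.edu
bootloader --location=mbr
clearpart --all --initlabel
part swap --recommended
part / --fstype ext3 --size=1 --grow
%packages
@base
# plus whatever the simulations need (gcc and friends)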

And in the post-install, it does the following (sketched out in kickstart form right after this list):

  • Changes the passwd binary on all nodes to yppasswd, so when people try to change their password, it propagates properly through the NIS domain.
  • Adds “+::::::” to /etc/passwd – this basically tells the node that anyone not in the file should be looked up in “Yellow Pages” (or the NIS server, whichever you prefer :-P ).
  • Adds “+:::” to /etc/group – the same exact thing as the previous one, but with groups instead of passwords.
  • Puts an entry in /etc/fstab to mount the NFS share I have running for everyone’s home drive, allowing SSH trusts to remain in place no matter how many times I wipe the nodes :).
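In kickstart terms, those steps come out to roughly this (again a sketch – the NFS server name and export path here are assumptions):

%post
# Point password changes at NIS instead of only the local file
mv /usr/bin/passwd /usr/bin/passwd.local
ln -s /usr/bin/yppasswd /usr/bin/passwd

# Fall through to NIS for any user or group not found locally
echo "+::::::" >> /etc/passwd
echo "+:::" >> /etc/group

# Mount everyone's home directories off the head node's NFS share
# (hostname and path are made up here)
echo "head.btdpool.ee.mtu.edu:/export/home /home nfs defaults 0 0" >> /etc/fstab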

These things seem relatively simple…now. Trust me, it took me many, many hours to figure out exactly what I needed to do to get everything working like this. Now all I have to do to “refresh” the nodes is PXE boot them, and reboot after they’re done. Since Nagios only checks SSH and ICMP connectivity to the nodes, everything will still show up as fully functional.
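For the curious, the per-node services look something along these lines (a sketch, not my exact config – it assumes the stock check_ssh/check_ping plugins and a hypothetical host named node01):

define service {
    use                     generic-service
    host_name               node01
    service_description     SSH
    check_command           check_ssh
    notification_interval   5     ; nag me every 5 minutes until it's fixed
}

define service {
    use                     generic-service
    host_name               node01
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
    notification_interval   5
}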

All in all, it has been a huge learning experience for me, getting to know a lot about RHEL administration. It’s been a lot of fun, and I’ll get into all the scripting that I did to make everything work flawlessly :).

I’ve got a lot of things happening using cron and bash scripting, but I don’t feel like getting into that right now.

Also, look at the blast from the past that I found today at work!

[Photos: top, right, front, left, and back views]

I saw it sitting there and literally burst out laughing. There was a huge cart of old crap sitting on the left when I walked in, and I spent about 20 minutes digging through it for nostalgic things like this :).

LOPSA

Posted in Random on November 11, 2009 by lucywitdiam0nds

Hooray! I just joined the League of Professional System Administrators. I have a feeling this is going to be a good outlet both for learning and for finding contacts in the future. I joined on the advice of Matt Simmons, so thanks for letting me know about it!

If you’re interested in system administration, you should check it out – it’s only $25 to join if you’re a student :).

http://lopsa.org/

 

Also, I joined both of my Zeroshell tutorials together into a single page – check it out!

http://talesofacoldadmin.wordpress.com/migrating-from-linksys-linux-with-vpn/

 

Windows 7 Driver Signing = Headache, but it’s for a reason?

Posted in Random on November 10, 2009 by lucywitdiam0nds

Don’t get me wrong, I understand why driver signing is necessary. I do. But I gained a higher appreciation for it when I mentioned something about it to my professor. Some of his former students currently work for Microsoft, and one of them had discussed this with him at some point.

It all comes back to Apple. I know it sounds a bit weird, but they inadvertently forced Microsoft to adhere to stricter conventions to improve the overall stability of their OS.

Something like 90% of all Windows crashes (blue screens of death) are caused by third-party drivers functioning improperly. I can see that being the issue, because there is very limited checking when it comes to drivers in the first place. Drivers run in kernel mode, right up against the metal, which means building something to test whether a driver is going to interact with the Windows kernel cleanly is a damn near impossible task.

This is where Apple actually does something right. They control every aspect of the hardware and software on their computers, which means that all their drivers are professionally written, and tested thoroughly for stability. When I came to that realization, I felt like someone had slapped me in the brain. Apple…did something RIGHT?
I’ve been a longtime advocate of Windows and Unix-based operating systems (EXCLUDING Apple’s OS) for obvious reasons, and seeing Windows 7 take a leaf out of Apple’s book is a bit weird for me. It’s still a pain in the ass, as with most Apple-based decisions, but it’s almost necessary for Microsoft to improve the stability of Windows.

I understand Microsoft’s mentality on this, but it should still be acceptable to use unsigned drivers when you really want to, perhaps with a warning that they may cause stability issues.
Namely – I want to be able to use my hax0red Xbox controller, damn it! (And yes, I know the Xbox used USB with a different port configuration :-P ) It’s the one thing that doesn’t (and will never) work on 7 unless I reboot and choose “disable driver signing” – and when I restart, it re-enables itself :(. Sure, there are ways to edit your bootloader to always boot into this special debug mode (see below), but wtf. WHY?
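For reference, the bootloader edit people usually mean is flipping Windows into test-signing mode from an elevated command prompt (fair warning: Windows watermarks the desktop with “Test Mode” while it’s on):

REM Allow unsigned/test-signed drivers to load on every boot
bcdedit /set testsigning on
REM Undo it later with: bcdedit /set testsigning off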

http://www.killertechtips.com/2009/05/05/disable-driver-signing-windows-7/

The only reason for any of this is the fact that getting a driver signed requires a valid certificate from VeriSign ($300-$400 a year!)

http://en.wikipedia.org/wiki/Criticism_of_Windows_Vista#Driver_signing_requirement

Shenanigans, I say!

Zeroshell: Part 2 – VPN & NAT

Posted in Unix on November 9, 2009 by lucywitdiam0nds

Firstly, let me just say thank you to everyone who’s been reading! Three days and 8,330 hits! I really didn’t know so many people cared about what I have to say!

(In the time it took me to write this article my hits jumped up to 11,114 hits! Holy crap!)

So coming back to my previous post:

Getting DDWRT to play nice with Zeroshell

After getting everything working properly, with routing set up between interfaces, we can utilize one of the coolest features of Zeroshell: VPN access (almost) out of the box.

Getting VPN to work:

[Screenshot: Zeroshell’s VPN page]

For those of you who don’t know what a VPN is, it stands for Virtual Private Network. It is simply a method of securely connecting to your home network, as if you were physically wired into it, when in fact you are connecting over the internet. This particular method wraps the VPN in two layers of security – an X.509 certificate and a Kerberos server. Plenty of encryption for an enterprise solution (small scale, of course). Keep in mind that you can set this VPN server’s authentication in one of three ways: X.509 certificate, username/password, or both. In this case, we’re using both.

You need to make a couple decisions though at first.

  • Do you want to have just one user on your VPN?
  • Do you want multiple users with the same username? (not necessarily recommended)
  • What authentication do you want to set up? (credentials + X.509 is what I’m using here)

To start, go into the Users section on the left, select the admin user, and click the X509 tab at the top, which will give you lots of information about that particular user (and any subsequent users as well).

[Screenshot: the admin user’s X.509 tab]

We need to export this certificate and place it in a good location. You should name it admin.pem if it isn’t already.

This is only one of the certs we need, though. We also need something called the “Trusted CA”. CA stands for Certificate Authority; in the world of certificates, this is a trusted source – the administrative entity that is considered ‘always valid and all-knowing’, which in our case is our Zeroshell install. As long as the CA says a cert is fine, any service using it will trust it, much like SSL certs.

Now we need to get a copy of our Trusted CA and enable the actual VPN functionality of our Zeroshell. What you want to do is click on the Trusted CAs button under the X.509 Configuration, which will spawn a window. Export the CA as a .pem, and put it somewhere safe.
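If you want a quick sanity check before going further, OpenSSL can confirm that the user cert you exported actually chains to that CA (plain OpenSSL here, nothing Zeroshell-specific – the filenames assume you named things as above):

openssl verify -CAfile CA.pem admin.pem
# should print "admin.pem: OK" if both exports went cleanly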

[Screenshot: exporting the Trusted CA]

Now, to enable VPN functionality, we need to click the ‘enable’ box and click Save. Complicated, right?

[Screenshot: enabling the VPN server]

Next we need to get a copy of the OpenVPN client. Since I’m using Windows 7, I had to get an RC with a properly signed driver (shakes fist at Microsoft). Also, make sure you run the installer in compatibility mode for Windows Vista, and run it as administrator (what a pain in the ass):

http://openvpn.net/release/openvpn-2.1_rc19-install.exe

The newest installer can be found here, but it didn’t seem to work for me when I installed it :(

http://openvpn.net/release/openvpn-2.1_rc20-install.exe

After getting it all installed and whatnot, move CA.pem and admin.pem to:

C:\Program Files\OpenVPN\config

or, in my case (64-bit machines):

C:\Program Files (x86)\OpenVPN\config

Now get into that directory, create a file named config.ovpn, and copy my config from:

http://pastebin.com/f6c913fcd (edited from the Zeroshell OpenVPN configuration – thanks guys!)
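In case the pastebin ever dies, the shape of it is roughly this (a sketch rather than my exact file – the port, the device type, and the assumption that the exported admin.pem contains both the cert and its private key are all things to verify against your own setup):

# first line: your server's IP and port (edit these)
remote your.server.ip 1194
client
proto tcp-client
dev tap
# the two certs we exported earlier; this assumes admin.pem
# also carries the private key
ca CA.pem
cert admin.pem
key admin.pem
# prompt for the username/password half of the authentication
auth-user-pass
# drop this to 3 once everything works
verb 5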

Now just edit the first line to hold whatever your server’s IP is, and the proper port if necessary. You’ll probably want to decrease the verbosity to 3 once you’re sure everything works, too. You should now be able to start the OpenVPN GUI, which will put an icon in your taskbar.

Right-click it and go to Connect; it should prompt you for a username and password. Log in with the admin user, unless you have other users to use.

[Screenshot: the OpenVPN login prompt]

You should now be successfully connected to your VPN in a very secure manner!

Getting NAT working:

Alright, for those of you who don’t know what NAT is, it stands for Network Address Translation. Here it’s a fancy way of saying port forwarding, which does exactly what it sounds like: it exposes a port on one of your clients to the outside world, so that someone looking at your router from the internet can ‘see’ into your network.

First, about my interfaces:

ETH00 is my internal network 10.0.0.0/24

ETH01 is my WAN port, connecting me to the internet

So, the first entry in my NAT table:

I have an FTP server running on port 21 of my laptop, which I want to be able to access from anywhere. In order to connect to my external IP on some port and talk to my laptop, I need to tell my router what to do.

Go into the Router section on the left, then click the Virtual Servers tab at the top, which will spawn a window for you.

For whatever reason, you need to specify your internal interface (the one that takes care of your internal network), which in my case is ETH00, and any IP address (you can restrict services like Remote Desktop to a single IP, or a range of IPs, if you want).

The Local Port box specifies the port you want to connect to from the outside.

The Remote IP is the local IP address of the machine you want to forward to the outside.

The Remote Port is the local port running on said IP address.

In this example, I have an FTP server running on port 21 on my local network, which is getting forwarded to port 65534 – so to talk to my laptop properly, I need to connect to my.external.ip.address:65534.

[Screenshot: the virtual server entry]
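If you’re curious what’s happening under the hood: Zeroshell is Linux, so a virtual server entry ultimately boils down to an iptables DNAT rule, something like the line below (a sketch – I haven’t dumped Zeroshell’s actual chains, 10.0.0.50 is a made-up laptop IP, and the kernel may name the WAN interface eth1 rather than ETH01):

# Rewrite anything hitting the WAN side on 65534 to the laptop's FTP port
iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 65534 -j DNAT --to-destination 10.0.0.50:21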

The same goes for all other services. For BitTorrent, use the same port for the local and remote ports, and it’ll be a 1:1 port map to your local client. ’Tis truly handy :)

The proof’s in the pudding:

C:\Users\P0rT_Smoke>nmap -T4 -sV -p65534 71.xx.222.63

Starting Nmap 5.00 ( http://nmap.org ) at 2009-11-09 22:12 Pacific Standard Time

Interesting ports on 71-xx-222-63.dhcp.mrqt.mi.charter.com (71.xx.222.63):
PORT      STATE SERVICE VERSION
65534/tcp open  ftp     FileZilla ftpd 0.9.31 beta
Service Info: OS: Windows

Service detection performed. Please report any incorrect results at http://nmap.org/submit/ .
Nmap done: 1 IP address (1 host up) scanned in 2.42 seconds

So hopefully now you have an awesome network setup allowing you to securely access any local resources you may need, as well as forward services you can afford to have ‘on the wire’.

Nmap is your friend!

See, even Trinity used Nmap to save all of humanity. It can work for you too!

Fiber optic…circuits?

Posted in Unix on November 9, 2009 by lucywitdiam0nds

Today I was leaving class, and I struck up a conversation with someone who was in quite a hurry to get to a team meeting, so I inquired what it was about.

He said fiber optics, and I figured, you know, alright, just another fiber optic whatever. So I dug a bit deeper, and it turns out he’s working on getting fiber optic integrated circuits to work. Let me say that again: FIBER OPTIC CIRCUITS!!!

As soon as he said that, I immediately had ten ideas in my head, and we went branching off in our separate directions. This interested me so much that I walked away from the elevator I was waiting for just to talk with him about it.

As any EE major knows, the reason that we use serial communication as opposed to parallel communication is that we don’t have to worry about syncing the clock (the signal telling your circuits on what interval to do something).

 

Data used to be transmitted like this:

0-=============-0

1-=============-1

2-=============-2

3-=============-3

4-=============-4

5-=============-5

6-=============-6

7-=============-7

This is known as a parallel transfer because, as you can see, the data lines travel in parallel with one another.

 

As ICs get smaller and more efficient, the length of the copper wire starts to affect its latency. In short, the minuscule length differences between lines 6 and 7 would cause their signals to be just the slightest bit off.

This used to be fine when we had a long clock period, but when you turn the clock frequency up (thus increasing the speed of your circuit), there is a major problem getting the signals to sync properly.

Then someone figured out that you could push speeds much past what parallel connectors allowed by using a serial connector with an embedded clock signal.

This allows us to turn the clock WAY up and get a major performance boost, as we no longer have to deal with different-length connectors throwing our main clock off.

Essentially, what this boils down to is that we can now take all these super-fast serial lines and run them in parallel again, giving us a multiplication of throughput – thus shattering yet another bottleneck in the computing industry. Amongst other things.

How about not having to worry about your circuit “shorting out”? We could literally make this theoretical circuit waterproof by nature, because we’re using lasers (which still transmit under water – not to say they won’t be refracted, though).

Needless to say, I’ve been thinking about this for a while now, and I am still just stunned at the implications.

Imagine your next iPod having a few million of these in it :-P

[Photo: bundles of optical fiber]

Then the internet will REALLY be made of tubes

I just can’t wait to see how they actually implement this in circuits. Interesting!

Nodes mysteriously dying?

Posted in Unix on November 9, 2009 by lucywitdiam0nds

So I’ve got three nodes that have randomly decided to crash over the past few days, and it seems like a daily occurrence now: waking up to 70 emails in my inbox (I have Nagios email me every 5 minutes there’s a problem until it’s fixed). Now I just need to find the time to actually sit down and diagnose these things. I’m half-tempted to just re-image everything, as there should be nothing on the computational nodes themselves – just the NFS share of /home, which obviously won’t be mounted when I PXE boot them.

I’d like to figure out what exactly is going on with them before I just nuke everything, although for time’s sake, I might end up doing just that.

Oh well, it’s probably just someone’s crappy code breaking each node. At least it’s only been one at a time, and it doesn’t look like there have been any jobs submitted in a while.

I’ll probably end up pulling that apart tonight (if I find time between editing my first and second posts), although the biggest pain is that I almost have to be there to see what exactly is going on.

::shrugs:: time for some learnin!

[Screenshot: the Nagios status page]

Stopping in to check on the nodes, I saw the following two screens from dmesg | more:

[Photo: dmesg output, first screen]

It looks as though eth0 is having a hardware fault and gets stuck with MAC_TX_MODE=fffffff. I’m not sure what that status means exactly, but I’ll figure it out, as this problem is what’s been affecting all of my nodes – and if people can’t connect to them, obviously they can’t be used for computations :(

[Photo: dmesg output, second screen]

I’ll post more when I actually know what’s going on. Open to suggestions though!

Hopefully I’ll be able to figure this out without wiping all my nodes clean. That would be kind of a pain in the butt.

This is a pretty bitchin rendition of panic.c (kernel panic)

So it seems like a bug in the tg3 driver, which drives the NIC – although my money is still on a hardware failure that is tiny but there. I sincerely hope I’m wrong, but I’ve diagnosed a lot of failures like this in the past, and it’s generally turned out to be something that is just going to get worse :(.
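For anyone following along at home, these are the standard things I’ll be poking at next time I’m in front of a node (stock tools on RHEL 5, nothing exotic):

# Pull just the NIC driver's complaints out of the kernel log
dmesg | grep -i tg3
# Dump the NIC's statistics and error counters
ethtool -S eth0
# Confirm which version of the tg3 driver the node is actually running
modinfo tg3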
