Thursday, April 23, 2009

Linux Lunacy

Okay, so since I got my brand new PC, I've had a few spare parts to play with. I went to rebuild a system with some odds and ends using my previous board, only to have it mysteriously shut down and start screaming a high-pitched noise. Further investigation (read: pulling the entire sucker apart until there was nothing left) showed the CPU had become unseated from its socket. I assume heat did it, because none of the clamps had come off and the heatsink was firmly attached. Unfortunately, the CPU is fried, so I dug out an even more ancient board, my old AMD K7 with a Duron 850MHz.

After messing around a little, I finally settled on Debian. I was looking to set up a media center in the lounge room for my mother to watch videos on (she's using her ASUS Eee-PC with its tiny screen at the moment), but I also wanted a system I could (ab)use remotely. Linux seemed to fit the bill nicely - or so I thought.

I have literally spent the last three days, and two migraines, making myself feel crappier than I already did. Not over installing Debian itself, mind you: putting a base system on there went flawlessly once I realised GRUB's "Error 18" meant the board's BIOS couldn't boot from anything past roughly the first 8GB of the disk. I guess the board really is old, and to be honest I am surprised it still works; it's been collecting dust for over six years now. My problem lay in trying to get the 3D accelerator (an ATI Radeon 9550, orphaned by the dead system) going.

Now, I'm no Linux newbie. I've used it quite competently in the past (to the point where I even had fun playing with Gentoo at one stage), but the error I was getting had me stumped. X.org would start, but all I would get in the log was this:
(**) fglrx(0): using built in AGPGART module: no
(II) fglrx(0): [pci] find AGP GART
(EE) fglrx(0): [agp] Failed to get AGP mode!
(EE) fglrx(0): cannot init AGP
(II) fglrx(0): driver needs X.org 7.1.x.y with x.y >= 0.0
(WW) fglrx(0): could not detect X server version (query_status=-1)
(EE) fglrx(0): atiddxDriScreenInit failed, GPS not been initialized.
(WW) fglrx(0): ***********************************************
(WW) fglrx(0): * DRI initialization failed! *
(WW) fglrx(0): * (maybe driver kernel module missing or bad) *
(WW) fglrx(0): * 2D acceleraton available (MMIO) *
(WW) fglrx(0): * no 3D acceleration available *
(WW) fglrx(0): ***********************************************
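
If you want to pull just the complaints out of a log like the one above, here's a minimal sketch. The function name is my own invention, and it assumes log lines start with the "(EE)" or "(WW)" marker exactly as shown here; the usual log location, /var/log/Xorg.0.log, is also an assumption and may differ on your system.

```shell
# Print only the fglrx error (EE) and warning (WW) lines.
# Reads the files given as arguments, or stdin if there are none.
# fglrx_problems is a made-up helper name, not part of any driver tooling.
fglrx_problems() {
    grep -E '^\((EE|WW)\) fglrx\(' "$@"
}
```

Something like "fglrx_problems /var/log/Xorg.0.log" should then show only the lines worth worrying about.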

No amount of messing around with the proprietary or open source drivers would do anything to fix this; something was stopping the driver from getting access to the AGP card. My friend, Hirato, even decided to give it a shot by logging in remotely and playing around (I suspect to stop me from getting dismayed at Linux completely), but even he gave up after a few hours. It seemed unfixable. Even Google was coming up pretty useless here, and we geeks basically use it as our #1 information resource on the internet. Plenty of people were having the exact same problem, but there was never any real solution posted, and I guess anyone who hit it either gave up or never shared their success story. So that's the gap I'm trying to fill here.

So, I was about to give up, as my headache was starting to return from overexertion, when I decided to whack one more last-ditch search into Google: "linux agp amd k7". The results themselves didn't look too promising, apart from the top two, which were mailing list threads (on an almost completely irrelevant subject), but I decided to flip through the threads nevertheless. There was a bunch of talk about kernel crap and debug calls, but those discussions came down to one clear thing: EDAC on an AMD K7 running Linux steals system resources and doesn't release them to other modules, like "agpgart".

To me, it seemed a bit far-fetched, but I decided to give it a go anyway - I had nothing left to lose. I ran "lsmod | grep edac" and got two results: "amd76x_edac" and "edac_core". Okay, so I poked around /etc a little and found "/etc/modprobe.d/blacklist", to the bottom of which I added these two lines:
blacklist amd76x_edac
blacklist edac_core
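
If you'd rather script that step than edit the file by hand, here's a minimal sketch. The function name is my own, and it simply appends a "blacklist" line for each module unless that exact line is already in the file, so running it twice won't leave duplicates.

```shell
# Append "blacklist <module>" lines to a modprobe config file,
# skipping any module that is already blacklisted there.
# Usage: blacklist_modules FILE MODULE...
# (blacklist_modules is a made-up helper name.)
blacklist_modules() {
    file=$1
    shift
    for mod in "$@"; do
        # -q quiet, -x whole-line match, -F fixed string (no regex)
        if ! grep -qxF "blacklist $mod" "$file" 2>/dev/null; then
            printf 'blacklist %s\n' "$mod" >> "$file"
        fi
    done
}
```

Run as root, that would look like "blacklist_modules /etc/modprobe.d/blacklist amd76x_edac edac_core". As far as I know, newer modprobe versions only read files in /etc/modprobe.d whose names end in ".conf", so the file name may need adjusting on a more recent system.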

I then rebooted the machine just to be sure everything was back to running how it should be, as I had been messing around a lot, but I'm pretty sure I could've just run "rmmod" on the modules if things had been clean. I ran "startx" and, to my surprise, the damn thing worked.
(II) fglrx(0): AGP card detected
(**) fglrx(0): using built in AGPGART module: no
(II) fglrx(0): [pci] find AGP GART
(II) fglrx(0): [agp] AGP protocol is enabled for graphics board. (cmd=0x0f000314)
(II) fglrx(0): [agp] graphics chipset has AGP v2.0
(II) fglrx(0): DRI initialization successfull!
(II) fglrx(0): Acceleration enabled
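
For anyone wanting to try the no-reboot route I mentioned, here's a rough, untested sketch (I rebooted, remember). The helper name is made up; all it does is pick the EDAC module names out of "lsmod" output so you can see what is loaded before removing anything.

```shell
# Print the names of loaded modules whose name contains "edac".
# Expects lsmod-style text on stdin: a header line, then one module
# per line with the module name in the first column.
# (loaded_edac_modules is a made-up helper name.)
loaded_edac_modules() {
    awk 'NR > 1 && $1 ~ /edac/ { print $1 }'
}
```

Usage would be "lsmod | loaded_edac_modules". On my board that should list "amd76x_edac" and "edac_core", and the unloading itself would then be "rmmod amd76x_edac" followed by "rmmod edac_core" as root, in that order, assuming the chipset driver depends on the core module. Whether agpgart picks up the card afterwards without a reboot is something I haven't verified.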

Honestly, I did a dance in the living room right then and there. Three days of messing around and it was something as simple as removing a couple of faulty modules; I really couldn't believe it. It's this type of obscure annoyance that made me give up on Linux in the first place, but I really have to say I am overjoyed to have found a solution that nobody else seemed to have pieced together. So if you're having trouble getting an AGP card going in an old system (namely the K7 series motherboards), try turfing out the "edac" modules - it worked for me.

And that's my success story on another adventure in Linux Lunacy. Hopefully you Googlers will come across this entry in the years to come, find it answers your question and be forever grateful that you never got the headache I had to endure. Happy 3D acceleration to you.

1 comment:

Anonymous said...

i love you