Improving Your Amiga's Performance

Author: dhomas trenn
Published by: NewTekniques magazine (US)
Date: February 1999

Optimization for Accelerators
The Tale of the Oxyron, the Cyber and the Matthias

Installing an accelerator is probably the best way to increase the speed of your Amiga applications, but it is not the only one. If you already have a 68040/060 accelerator, or expect to get one, you may be able to make your system faster still.

Many programs that require intensive numeric calculations (image processing, raytracing, etc.), particularly older ones, are programmed to take advantage of a 68881/82 FPU (math coprocessor) so that they perform operations faster. The problem, though, is that 040/060s do not implement all of the instructions those coprocessors provide.

When an 040/060 encounters an instruction it does not support, it raises an exception: an interruption that causes the Amiga to stop what it is doing and substitute something for the unsupported instruction - usually a slow emulation routine from the 68040.library or 68060.library in LIBS:. If these interruptions occur often enough, the system can become very slow, which may show up as an unresponsive mouse pointer. Fortunately, there are a few solutions to this problem, and using them brings some performance benefits as well.
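
To make the mechanism concrete, here is a minimal sketch in Python of the trap-and-emulate idea. It is only a model and says nothing about how the 68040/060.library is actually written: the instruction names, the NATIVE and EMULATED tables and the run function are all assumptions made for illustration.

```python
import math

# Operations the modelled CPU executes directly (illustrative subset).
NATIVE = {
    "fadd": lambda a, b: a + b,
    "fmul": lambda a, b: a * b,
}

# Software routines standing in for the library's emulation code.
EMULATED = {
    "fsin": math.sin,
    "fcos": math.cos,
}


class UnsupportedInstruction(Exception):
    """Raised when the modelled CPU meets an instruction it lacks."""


def cpu_execute(op, *args):
    """Execute a native operation or raise, as a real CPU would trap."""
    if op not in NATIVE:
        raise UnsupportedInstruction(op)
    return NATIVE[op](*args)


def run(program):
    """Run (op, args...) tuples; return the results and the trap count."""
    results = []
    traps = 0
    for op, *args in program:
        try:
            results.append(cpu_execute(op, *args))
        except UnsupportedInstruction:
            traps += 1  # on real hardware every trap costs extra time
            results.append(EMULATED[op](*args))
    return results, traps


results, traps = run([("fadd", 1.0, 2.0), ("fsin", 0.0), ("fmul", 2.0, 3.0)])
print(results, traps)
```

The trap count is the number to watch: each trap adds handling overhead on top of the emulation routine itself, so a program that traps inside a tight loop pays that cost again and again.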

Among the various utilities available for Phase5 accelerators is a program called CyberPatcher (????). Starting it is as simple as double-clicking its icon, or dragging it into your WBStartUp directory and rebooting. Once active, it watches for illegal processor instructions and replaces the 68040/060.library emulations with more efficient functions. The result is a faster system with less sluggish behaviour.

Achim Koyen's (New Generation Software) Oxyron Patcher, more commonly known as OxyPatcher, provides functionality similar to CyberPatcher's and adds various optimization and configuration options, the ability to monitor its activity, and support for most accelerators. In addition to placing the main program in WBStartUp, you must also run an initialization program from your startup-sequence. When booting, this startup program installs itself in memory and then resets your system. On the test system this added about twenty seconds to the boot process. As with CyberPatcher, illegal instructions are trapped and library emulation routines are replaced with much faster functions. OxyPatcher is commercially available for about US$30.

Matthias Henze (Cyberdyne Systems) takes a different approach, which involves replacing the standard math libraries with versions that use only valid 040/060 instructions. Because interruptions are avoided and more efficient routines are used, there can be a drastic improvement in performance. The HSMathLibs (www.hsmathlibs.de/index_e.html) are programmed in assembler and available as optimized versions for the 68040 and 68060 to achieve maximum performance. These libraries are shareware and priced at a very reasonable US$7 - which includes future updates, downloadable from the support website. A special 68881/82 release, for systems without an 040/060, is currently being beta tested and should be available soon.

Many users report that the best results are achieved when using HSMathLibs in combination with OxyPatcher or CyberPatcher. Why is that? This can be a bit confusing and requires some understanding of how things actually work behind the scenes.

When a program is compiled, various options can be set to make the program expect a particular CPU (680x0) or an FPU (6888x), or to use the Amiga math libraries. This is why you will often see several versions of a program (Clouds_060, Clouds_030, etc.). Note that if you start a program compiled for an 060 on a system with an 030, the computer will likely crash. Newer CPUs are mostly backwards compatible, though, so you can usually run an 030 or 040 program on an 060.

The Amiga's math libraries were designed to recognize whether an FPU is present and, if so, to use its faster capabilities. If an FPU is not present the math still has to be done, so functions were included within the libraries themselves to do it. With that in mind, many programs were compiled to use the math libraries, relying on them to determine the faster way to process the calculations. The benefit was that only one version of a program was needed - perhaps requiring only one disk for distribution instead of two.

The drawback is that it takes time to determine whether an FPU exists. Although the amount is very small, it adds to the time of each calculation. The HSMathLibs are built specifically for each CPU, so some of their extra speed comes from not having to determine what hardware is present. But be careful: if you try to use the 060 versions of the HSMathLibs on an 030 system, can you guess what will happen? CRASH!
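
The difference between a library that checks the hardware on every call and one built for a fixed target can be sketched as follows. This is a simplified Python model, not the design of the Amiga math libraries or of HSMathLibs; the class names, the checks counter and the Newton-iteration fallback are all assumptions chosen for illustration.

```python
import math


def fpu_sqrt(x):
    """Stands in for a calculation done directly by an FPU."""
    return math.sqrt(x)


def soft_sqrt(x, iterations=30):
    """Software fallback: square root by Newton's iteration."""
    if x < 0:
        raise ValueError("square root of a negative number")
    if x == 0:
        return 0.0
    guess = x if x >= 1 else 1.0
    for _ in range(iterations):
        guess = 0.5 * (guess + x / guess)
    return guess


class GenericMathLib:
    """One library for every machine: decides per call which path to use."""

    def __init__(self, has_fpu):
        self.has_fpu = has_fpu
        self.checks = 0  # counts how often the hardware test was made

    def sqrt(self, x):
        self.checks += 1
        return fpu_sqrt(x) if self.has_fpu else soft_sqrt(x)


class FixedTargetMathLib:
    """Built for one target: no test, but wrong hardware means failure."""

    sqrt = staticmethod(fpu_sqrt)


generic = GenericMathLib(has_fpu=False)
print(generic.sqrt(2.0), generic.checks)
print(FixedTargetMathLib().sqrt(9.0))
```

The generic library works everywhere but pays for the check each time; the fixed-target one skips it and simply assumes the right hardware is there, which is why installing the wrong version ends badly.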

When systems with an FPU became more common, many developers made FPU-compiled programs the standard wherever fast math was important. These applications use the FPU directly, and so are faster than they would be going through the math libraries - unless, of course, you do not have an FPU in your system.

Now, some programs are available that claim to be compiled specifically for the 040 and/or 060. If programmed properly, these applications should only use valid 040/060 instructions - and should use them directly, resulting in the fastest performance possible. My testing, however, showed that this was not always the case.

I found several programs, claimed to be compiled specifically for the 060, that were still being trapped by OxyPatcher. According to OxyPatcher's documentation, the only time that should happen is when a program uses invalid instructions. So, clearly, something must be amiss.

These days, there is a little of everything out there. So if you combine OxyPatcher or CyberPatcher with the HSMathLibs, you cover all the possibilities. If a program uses FPU instructions that the 040/060 does not support, the patcher steps in and does the job; if the math libraries are accessed, HSMathLibs takes over. In each case, faster performance is the result.

Note that because OxyPatcher and CyberPatcher use the same method of operation, they should not be used together. Each would likely confuse the other and probably send your system spiraling to GURU doom.

See Test Results below for more details and the results of several speed tests.

Faster Kickstart ROMs
Just about everything your Amiga does requires access to functions contained within its Kickstart ROMs. Typically, reading these ROMs is much slower than reading from the Fast RAM that can be installed on most accelerators.

Several freely available programs (such as QuickROM (aminet: util/sys/QuickROM.lha) and ROM2Fast) take advantage of this fact, as well as the Amiga's ability to relocate its Kickstart into memory.

Nic Wilson's KickSpeed tester reported that Kickstart could be read 3.6 times faster, on our test system, when using ROM2Fast. The QuickROM documentation suggests that, on some systems, results can be even better.

Keep in mind that copying the Kickstart ROM to RAM means that you will have less memory available for your applications. So if you often run out of memory, this might not be such a great advantage. If you have lots of memory, this speed improvement could be a real bonus!
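
How much a 3.6 times faster Kickstart helps overall depends on how much of a program's time is actually spent reading Kickstart code, which was not measured here. A rough estimate can be sketched with Amdahl's law; the rom_fraction values below are purely hypothetical and the function name is an assumption for illustration.

```python
def overall_speedup(rom_fraction, rom_speedup=3.6):
    """Amdahl's law: only the ROM-bound share of the run time gets faster.

    rom_fraction is the share of run time spent reading Kickstart (0..1).
    rom_speedup is how much faster those reads become (3.6 on the test system).
    """
    if not 0.0 <= rom_fraction <= 1.0:
        raise ValueError("rom_fraction must be between 0 and 1")
    return 1.0 / ((1.0 - rom_fraction) + rom_fraction / rom_speedup)


# Hypothetical workloads: 10%, 25% and 50% of the time spent in Kickstart.
for fraction in (0.10, 0.25, 0.50):
    print(f"{fraction:.0%} in ROM: {overall_speedup(fraction):.2f}x overall")
```

A program that rarely calls into Kickstart will see little change, so the memory given up for the ROM copy buys the most on systems that spend much of their time in operating system routines.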

Lower Monitor Power
If you frequented the arcades when you were younger, or still do, you have probably seen the effects of persistent display, often called screen burn-in. This is also commonly seen on monitors used with computerized point-of-sale systems.

If a monitor displays the same thing for long periods of time, that display can become "burnt" into the monitor. Once that happens, even with the monitor turned off you can see a strange, dark, ghost-like image.

The problem was first addressed by using a screen blanker, a program that ran in the background and blacked out your display after a preset period of time - a seemingly reasonable and sensible solution. Somehow, people got it into their heads that screen blankers should display stunning fractal graphics and entertaining cartoons - a concept I have never quite understood.

Environmental and health concerns have become prominent in discussions about computer hardware, and issues such as excessive power usage are big topics. As a result, most newer monitors now support DPMS - Display Power Management Signaling. This system allows the computer to tell the monitor to go into a lower power state - which also means the display is safe from burn-in. Finally, something that makes sense again.

If you use CyberGraphX (or something compatible) and have a monitor that supports the DPMS standard, Magnus Holmgren's CGXDPMS (aminet: util/cdity/CGXDPMS.lha) will let your computer automatically step the monitor through the DPMS power saving modes: On, Stand-by, Suspend and Off. The number of minutes of inactivity to wait before each mode can be set using tooltypes. CGXDPMS is free and a must-have commodity.
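
The stepping logic amounts to comparing the idle time against a threshold for each mode. The Python sketch below shows the idea only; the parameter names and default minutes are hypothetical and are not CGXDPMS's actual tooltype names or defaults.

```python
def dpms_state(idle_minutes, standby_after=10, suspend_after=20, off_after=30):
    """Return the DPMS mode for a given number of idle minutes.

    The three thresholds are hypothetical settings and must be in
    ascending order: Stand-by first, then Suspend, then Off.
    """
    if not standby_after <= suspend_after <= off_after:
        raise ValueError("thresholds must be in ascending order")
    if idle_minutes >= off_after:
        return "Off"
    if idle_minutes >= suspend_after:
        return "Suspend"
    if idle_minutes >= standby_after:
        return "Stand-by"
    return "On"


for minutes in (0, 12, 25, 45):
    print(minutes, dpms_state(minutes))
```

Any user input would reset the idle time to zero and bring the monitor back to On.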

A Watchful Eye
One of the common difficulties with installing software is that it does not always work the first time you try it. Sometimes the problem is caused by a missing or misplaced library, font or other necessary file. Unfortunately, most programs assume that everything is where it should be and do not bother to tell you if that is not the case. Often the program will not start or, in worse cases, it causes a system crash.

These problems can be extremely difficult, if not impossible, to track down. But there is help available. Eddy Carroll's SnoopDos (aminet: util/moni/SnoopDos.lha) is a must-have for every Amiga user. SnoopDos works by installing a patch that monitors various system and AmigaDOS function calls. With SnoopDos on the watch, you will be told what an application is looking for, where it expects to find it, and a lot of other possibly important information. Too much information? No need to worry, because SnoopDos can be told which function calls to monitor. SnoopDos is free.
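
The general idea of patching a function so that its calls are logged, with a filter for the calls you care about, can be sketched in Python. This is not SnoopDos's implementation: the CallMonitor class and the open_library stand-in (with its pretend list of installed libraries) are hypothetical names used only for illustration.

```python
class CallMonitor:
    """Wraps functions and logs calls to the ones being watched."""

    def __init__(self, watched):
        self.watched = set(watched)
        self.log = []

    def patch(self, name, func):
        """Return a replacement for func that records each call."""
        def wrapper(*args):
            outcome = "Fail"
            try:
                result = func(*args)
                outcome = "OK"
                return result
            finally:
                if name in self.watched:
                    self.log.append((name, args, outcome))
        return wrapper


def open_library(name):
    """Hypothetical stand-in for a system call that opens a library."""
    installed = {"exec.library", "dos.library"}
    if name not in installed:
        raise FileNotFoundError(name)
    return name


monitor = CallMonitor(watched={"OpenLibrary"})
open_library = monitor.patch("OpenLibrary", open_library)

open_library("dos.library")
try:
    open_library("missing.library")
except FileNotFoundError:
    pass

for entry in monitor.log:
    print(entry)
```

The failed entry in the log is exactly the kind of clue that helps here: it names the file the program wanted and could not find.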

All-In-One
There are literally hundreds of commodities available, each performing some unique system enhancement. Many of these provide very handy functions, so it is not unusual to want several running simultaneously. The inefficiency in this is that each of these utilities requires a certain amount of similar program code to make it function as a commodity. In many cases, the actual feature that the commodity provides takes up a very small fraction of the program's size.

And so... multi-function commodities were born. This is not a complete solution, because a particular multi-function program may not include all the functions you want. But, it certainly can cut down on wasted memory.

Several of these all-in-one programs are available. Among the best is Martin Berndt's MultiCX (aminet: util/cdity/MCX280.lha) which currently has more than 50 functions, including: screen and mouse blanking, window and screen cycling, opaque windows, window auto-activation, public screen selection and activation, drive protection, trackdisk.device parameters, popcli, ASCII enter, memory flush, advanced string gadget editing, and much more. MultiCX is available for a shareware fee of US$20.

Cut, Copy and Paste
"Clipboard" is a term that refers to a temporary storage area where pieces of text and graphics (called clips) can be copied or cut, to be pasted later in another location. Many applications have direct support for a clipboard, but many do not.

Nico François's PowerSnap (aminet: util/cdity/PowerSnap22a.lha) adds the ability to copy a region of text, displayed with any non-proportional font on any screen or in any window, and then paste it in other applications. PowerSnap is free.

Stephan Rupprecht's SGrab (aminet: util/wb/sgrab.lha) will allow you to copy a graphic from an entire screen, a window or a selectable region and save it to the clipboard or to a file. SGrab is giftware.

With these tools, you can clip a website address from a message in your email program and paste it into the URL gadget of your web browser. Or you could grab a section of your Workbench screen and paste it into a word processor or DTP program.

One problem, though, is that many applications store "clips" in different kinds of clipboards which other programs do not understand. This means that you cannot always cut/copy from one application and paste into another. Also, you are usually limited to pasting only the most recent cut/copy. The solution is Magnus Holmgren's ClipHistory (aminet: util/cdity/ClipHistory.lha). It provides access to clips from most applications and gives you a pop-up history list from which to choose previous clips. ClipHistory is free.
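
A clip history is essentially a short list of recent clips with the newest first. The Python sketch below illustrates that data structure only; the ClipList class and its max_clips setting are hypothetical and do not describe how ClipHistory itself stores clips.

```python
from collections import deque


class ClipList:
    """Keeps the most recent clips, newest first, up to a fixed limit."""

    def __init__(self, max_clips=10):
        # A deque with maxlen drops the oldest clip when it is full.
        self._clips = deque(maxlen=max_clips)

    def copy(self, text):
        """Store a new clip at the front of the history."""
        self._clips.appendleft(text)

    def paste(self, index=0):
        """Return a clip; index 0 is the most recent one."""
        return self._clips[index]

    def history(self):
        """Return all remembered clips, newest first."""
        return list(self._clips)


clips = ClipList(max_clips=2)
for text in ("first", "second", "third"):
    clips.copy(text)
print(clips.history())
print(clips.paste(1))
```

An ordinary clipboard behaves like this list with a limit of one, which is why only the last cut or copy can be pasted.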

System Monitors
A system monitor is a tool that allows you to monitor your computer's resources. It can give you detailed information about windows, screens, memory, tasks, CPU usage, ports, assigns, expansion boards, interrupts and more. Some even allow you to perform related actions, such as halting tasks or closing windows and screens.

One of the earliest of these utilities is Werner Gunther's XOper (aminet: util/moni/Xoper28.lha), which provides information in response to various keyboard commands. If you want something with a gadget interface, more functions and network support, try Richard Korber's Scout (aminet: util/moni/Scout.lha). XOper and Scout are free.



Test Results
I did extensive testing of CyberPatcher, OxyPatcher and the HSMathLibs on an A3000T/25 with a Phase5 Cyberstorm MK2 060/50MHz accelerator. The results were very interesting. Bolded times indicate noticeable improvements that can likely be attributed to the applied patch. Other slightly varied times are more likely a result of background system tasks.

Test                     Options                No Patch   CyberPatcher   OxyPatcher   HSMathLibs
Cold Boot                -                      40.00s     40.00s         62.00s       40.00s
Math Tests
mathffptest              1,000,000 iterations   31.74s     31.70s         31.70s       20.74s
mathieeedoubbasTest *    1,000,000 iterations   22.04s, 21.80s, 20.96s
mathieeedoubtransTest *  1,000,000 iterations   197.36s, 98.98s
mathieeesingbasTest      1,000,000 iterations   17.34s     17.40s         17.32s       16.58s
mathieeesingtransTest *  1,000,000 iterations   354.70s, 97.98s
mathtransTest *          1,000,000 iterations   630.30s, 625.72s, 104.38s
ImageFX
Clouds                   4000x4000x24 Default   412.74s    412.86s        412.82s      412.18s
PaintFX                  4000x4000x24 Swarm     33.76s     33.78s         33.78s       33.82s
Clouds.FP                4000x4000x24 Default   152.98s    138.76s        135.48s      152.28s
PaintFX.FP               4000x4000x24 Swarm     31.92s     30.20s         31.90s       31.88s
Clouds.FP060             4000x4000x24 Default   133.12s    129.88s        133.18s      133.08s
PaintFX.FP060            4000x4000x24 Swarm     31.88s     31.54s         31.86s       31.88s
Fractal Generation
FlashMandel_FPU_020+     1600x1200x8            27.00s     27.00s         27.00s       27.00s
FlashMandel_FPU_040+     1600x1200x8            27.00s     27.00s         27.00s       27.00s
FlashMandel_IEEE_ANY     1600x1200x8            149.00s    149.00s        149.00s      149.00s
* Fewer than four times are listed for this test; they are given in their original order rather than under specific columns.
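
To put the times in perspective, each one can be turned into a speedup factor by dividing the unpatched time by the patched time. The short Python sketch below does this for a few of the complete rows of the table; the speedup helper and the row labels are names chosen here for illustration.

```python
def speedup(baseline_s, patched_s):
    """Speedup factor: above 1.0 is faster, below 1.0 is slower."""
    return baseline_s / patched_s


# (No Patch time, patched time) in seconds, taken from the table.
rows = {
    "mathffptest with HSMathLibs": (31.74, 20.74),
    "Clouds.FP with CyberPatcher": (152.98, 138.76),
    "Clouds.FP with OxyPatcher": (152.98, 135.48),
    "Cold Boot with OxyPatcher": (40.00, 62.00),
}

for label, (baseline, patched) in rows.items():
    print(f"{label}: {speedup(baseline, patched):.2f}x")
```

A factor below 1.0, as for the cold boot with OxyPatcher, means the patched system took longer.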

HSMathLibs
The author of HSMathLibs claims that, of these three programs, his libraries are the fastest. Unfortunately, I was not able to verify that on the test system used. Although the math tests (????) performed did show a very significant increase in speed, I was not able to find any real-life applications that reflected this. It is important to note that these math speed test programs were provided by the HSMathLibs author. This does not, however, mean that his claims are false.

I have seen various mailing-list posts from people who claim to get very significant improvements when using the HSMathLibs, but I could not verify those tests. Better results may well depend on the accelerator being used.

A remark in the Phase5 060 notes suggests that the HSMathLibs might not be compatible with the 68060.library. This could very well explain why I saw no speed change. It states: "You need the original Commodore math libraries and not some custom libs found on aminet! The reason is that I have to patch the library base and don't wanna check out every math library incarnation if it supports the math precision hook."

CyberPatcher / OxyPatcher
The ImageFX test shows that the best results are obtained when using the 060 versions of ImageFX hooks/modules. When doing so, none of these patches appear to be of any benefit. If you don't have an 060 version of a hook/module, FP versions do show a significant improvement over the standard versions when used in conjunction with either of these patches. However, most hooks/modules that include FP versions also include 060 versions. So, for ImageFX, there doesn't appear to be any improvement with any of these patches.

I am unsure what the real compilation difference is between the three versions of FlashMandel. There didn't seem to be any difference with any of the tests.

Final Comments
A demo version of HSMathLibs is available, so it won't cost you anything to test the performance of these libraries with the applications you use. CyberPatcher is freely available, so again there's no loss in trying it. To the best of my knowledge there is no demo version of OxyPatcher.