• 0 Posts
  • 8 Comments
Joined 1 year ago
Cake day: June 14th, 2023




  • There’s some weirdness on that because she did some important but not-very-public work at IBM in the 60s with their ACS/“Project Y” effort, which did what we would later call superscalar/multi-issue processors like …20 years before those terms existed. As part of that she wrote a paper about “Dynamic Instruction Scheduling” in 1966 under her pre-transition identity that is like a retroactive first cause for a bunch of computer architecture ideas.

    There was almost nothing about that work in public until Mark Smotherman was doing some history of computing work in the late 90s, put out a call for information about it, and she produced a huge trove of insider information after deciding it was worth exposing the provenance. There’s a neat long-form LATimes piece about the situation which is probably the primary source for the history in OP’s link.


  • NeXTStep became MacOS (OS X and later) and related systems. It does contain quite a number of BSD pieces that are now periodically pulled from FreeBSD.

    The story is fun and I like telling it, strap in:

    NeXTStep was basically the Mach microkernel hybridized with BSD parts (actual Berkeley BSD, before the court cases and diaspora that led to Net/Free/Open BSD and Solaris and such) to make a reasonably performant, modern-design Unix-like with a fancy PostScript based display layer on top. The effort was led by Steve Jobs after he was ejected from Apple, a bunch of folks he effectively took with him, and Avie Tevanian, who was the major force behind Mach when it was a research project at CMU.

    Between 1988 and 1996 Apple failed like 3 times at building their own next-gen OS. Apple and IBM cooperatively fucked up the Pink/Taligent development process so hard it’s still told like a ghost story to software developers, Copland got out of control with feature creep and empire building, and A/UX, cool as it was, was never going to be a mainstream OS because it contained Unix-brand-Unix and the associated thousand dollar license fee (Also IBM got involved combining some future development of A/UX and AIX for PowerPC and some of the Taligent stuff entered the picture and it turned into a clusterfuck).

    So it’s the mid 90s and Apple is shipping an OS from the mid 80s that they had to do hacky shit to make do even cooperative multitasking, and the executives are looking to acquire one that is already done.

    Two former heads of the Mac division had left and built companies that tried to build whole computer platforms then pivoted to just selling software: Steve Jobs with NeXT and Jean-Louis Gassée with Be. The exact details of the negotiations about how they chose NeXT over Be are kind of ambiguous and vary from account to account; I’m personally of the opinion that the biggest reason is that OpenStep looked a lot like “Taligent, but not completely bungled.”

    So Apple Bought NeXT in 1997 in a scenario better described as “NeXT Bought Apple with Apple’s Money” because all the executives in charge after the shakeup were NeXT folks.

    There was an initial plan to more or less slap an OS 8-like, Copland-looking GUI on the hybridized Mach/BSD kernel and most of the userland from OpenStep (an environment that at the time was called “Yellow Box”), ship it with a virtualization environment called Blue Box to run legacy MacOS programs, and call it a day. You can even play with the missing link Rhapsody from that era, and the Classic environment that early OS X versions had is the direct descendant of that Blue Box compatibility environment.

    Then some of the important software vendors (read: Adobe, whose shitty development practices are why Macs are case insensitive by default to this day) revolted at the idea of having to do ground-up rewrites, so Apple designed the Carbon APIs that are kind of a stepping stone between classic MacOS and the native Cocoa APIs that evolved from the Yellow Box design, which are the usual target on modern macOS/iOS.

    (The story of how Windows NT - which is the underlayer of all modern Windows systems - came to be is similarly absurd. It’s basically “The DEC VMS team got pissed off about obviously-dumb management decisions and were looking to leave DEC while Microsoft became aware that OS/2 was failing largely due to bad IBM decisions, so Microsoft hired the core of that team to write the operating system they were designing, moved the few remaining competent folks over from OS/2 to help, and sold it as Windows.” As the joke goes, it may not be a coincidence that WNT is VMS incremented).


  • That’s credible.

    I find the hardware architecture and licensing situation with AMD much more appealing than Nvidia and really want to like their cards for compute, but they sure make it challenging to recommend.

    I had to do a little dead reckoning with the list of supported targets to find one that did the right thing with the 12CU RDNA2 680M.

    I’ve been meaning to put my findings on the internet since it might be useful to someone else, and this is as good a place as any.

    This is on a fresh Xubuntu 22.04.4 LTS install following the official ROCm 6.1 setup instructions, using a Minisforum UM690S Ryzen 9 6900HX/64GB/1TB box as the target, after setting the GPU Memory to 8GB in the EFI before boot so it doesn’t OOM.

    For OpenMP projects, you’ll probably need to install libstdc++-12-dev in addition to the documented stuff because HIP won’t see the cmath libs otherwise (bug). Then the CMakeLists.txt mods for adapting a project with accelerator directives to that target are

    # Point CMake at the ROCm install before looking for HIP
    list(APPEND CMAKE_PREFIX_PATH /opt/rocm-6.1.0)
    find_package(hip REQUIRED)
    # Compile and link through hipcc
    set(CMAKE_CXX_COMPILER ${HIP_HIPCC_EXECUTABLE})
    set(CMAKE_CXX_LINKER   ${HIP_HIPCC_EXECUTABLE})
    # Offload OpenMP target regions to the 680M (gfx1035)
    target_compile_options(yourtargetname PUBLIC "-lm;-fopenmp;-fopenmp-targets=amdgcn-amd-amdhsa;-Xopenmp-target=amdgcn-amd-amdhsa;-march=gfx1035")
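
    For anyone who wants something small to point those flags at, here is a rough sketch of an offload smoke test. It is not from the project I was adapting; the function names (runs_on_device, saxpy) are made up for illustration, and it only assumes a compiler that understands OpenMP 4.5 target directives. Without -fopenmp the pragmas are ignored and everything runs on the host, which is also roughly what a silent fallback looks like.

    ```cpp
    // Offload smoke test sketch: a SAXPY loop marked up with OpenMP target directives.
    // Names here are hypothetical; nothing in this file is ROCm-specific.
    #include <cassert>
    #include <cstddef>
    #include <vector>
    #ifdef _OPENMP
    #include <omp.h>
    #endif

    // Returns true only if a target region actually executes off the host.
    // Built without OpenMP, or with no usable device, this reports false.
    bool runs_on_device() {
        int on_host = 1;
    #ifdef _OPENMP
        #pragma omp target map(tofrom: on_host)
        { on_host = omp_is_initial_device(); }
    #endif
        return on_host == 0;
    }

    // Computes y = a*x + y element-wise, offloading the loop when a device is available.
    void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
        assert(x.size() == y.size());
        const float* xp = x.data();
        float* yp = y.data();
        const std::size_t n = x.size();
        #pragma omp target teams distribute parallel for map(to: xp[0:n]) map(tofrom: yp[0:n])
        for (std::size_t i = 0; i < n; ++i) {
            yp[i] = a * xp[i] + yp[i];
        }
    }
    ```

    Call saxpy on a couple of big vectors from main, print what runs_on_device() says, and watch amdgpu_top while it runs. If the helper says no, the runtime fell back to the CPU.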
    

    And torch, because I was curious how that would go (after I watched the Docker based suggested method download 30GB of trash then fall over, and did the bare metal install instead) seems to work with PYTORCH_TEST_WITH_ROCM=1 HSA_OVERRIDE_GFX_VERSION=10.3.0 python3 testtorch.py which is the most confidence inspiring.

    Also amdgpu_top is your friend for figuring out if you actually have something on the GPU compute pipes or if it’s just lying and running on the CPU.


  • Neat.

    I set up some basic compute stuff with the ROCm stack on a 6900HX-based mini computer the other week (mostly to see if it was possible, as there are some image processing workloads a colleague was hoping to accelerate on a similar host) and noticed that the docs occasionally pretend you could use GTT dynamically allocated memory for compute tasks, but there was no evidence of it ever having worked for anyone.

    That machine had flexible firmware and 64GB of RAM stuffed in it so I just shuffled the boot time allocation in the EFI to give 8GB to the GPU to make it work, but it’s not elegant.

    It’s also pretty clumsy to actually make things run, lot of “set the magic environment variable because the tool chain will mis-detect the architecture of your unsupported card” and “Inject this wall of text into your CMake list to override libraries with our cooked versions” to make things work. Then it performs like an old GTX1060, which is on one hand impressive for an integrated part in a fairly low wattage machine, and on the other hand is competing with a low-mid range card from 2016.

    Pretty on brand really, they’ve been fucking up their compute stack since before any other vendor was doing the GPGPU thing (abandoning CTM for Stream in like a year).

    I think the OpenMP situation was the least jank of the ways I tried getting something to offload on an APU, but it was also one of the later attempts so maybe I was just getting used to its shit.


  • Don’t trust that they’re 100% compatible with mainline Linux, ChromeOS carries some weird patches and proprietary stuff up-stack.

    I have a little Dell Chromebook 11 3189 that I did the MrChromebox coreboot + Linux thing on. A couple years ago I couldn’t get the (weird i2c) input devices to work right; that has since been fixed in upstream coreboot tables and/or Linux, but (as of a couple months ago) they still don’t play nice with smaller alternative OSes like NetBSD or a Haiku nightly.

    The audio situation is technically functional but still a little rough. The way the codec in bay/cherry trail devices is half chipset, half external occasionally leads to the audio configuration crapping itself in ways that take some patience and/or expertise to deal with (Why do I suddenly have 20 inoperable sound cards in my PulseAudio settings?).

    This particular machine also does some goofy bullshit with 2 IMUs in the halves instead of a fold-back sensor, so the rotation/folding stuff via iio sensors is a little quirky.

    But, they absolutely are fun, cheap hacker toys that are generally easy targets.