@__dev

__dev@lemmy.world · 5 months ago

I’m more familiar with RISC-V than I am with ARM though it’s my understanding they’re quite similar.

ARM/RISC-V are load-store architectures, meaning they divide instructions between loading/storing and doing computation. x86 on the other hand is a register-memory architecture, having instructions that do both computation as well as loading/storing.
ARM/RISC-V also have weaker guarantees as to memory ordering, allowing for less synchronization between cores. However, RISC-V has an extension to enforce the same guarantees as x86, and Apple’s M-series CPUs have a similar extension for ARM. If you want to emulate x86 applications on ARM/RISC-V, these kinds of extensions are essential for performance.
ARM/RISC-V instructions are variable width but only in a limited sense. They have “compressed instructions” - 2 bytes instead of 4 - to increase instruction density in order to compete with x86’s true variable width instructions. They’re fairly close in instruction density, though compressed instructions are annoying for compilers to handle due to instruction alignment. 4 byte instructions must be aligned to 4 bytes, so if you have 3 instructions A, B and C but only B has a compressed version then you can’t actually use it because there must be 4 bytes between instructions A and C.
ARM/RISC-V also make backwards compatibility entirely optional. Apple’s M-series don’t implement 32-bit mode, for instance, whereas x86-64 still has “real mode” for running 16-bit operating systems.

There’s also a number of other differences, like the number of registers, page table formats, operating modes, etc, but those are the more fundamental ones I can think of.

Up until your post I had thought it exactly was the size of the instruction set with x86 having lots of very specific multi-step-in-a-single instruction as well as crufty instruction for backwards compatibility (like MPSADBW).

The MPSADBW thing likely comes from the hackaday article on why “x86 needs to die”. The kinda funny thing about that is MPSADBW is actually a really important instruction for (apparently) video decoding; ARM even has a similar instruction called SABD.

x86 does have a large number of instructions (even more so if you want to count the variants of each), but ARM does not have a small number of instructions and a lot of that instruction complexity stops at the decoder. There’s a whole lot more to a CPU than the decoder.

__dev@lemmy.world · 5 months ago

compressed instruction set /= variable-width […]

Oh for sure, but before the days of super-scalars I don’t think the people pushing RISC would have agreed with you. Non-fixed instruction width is prototypically CISC.

For simpler cores it very much does matter, and “simpler core” here can also mean barely superscalar, but with insane vector width, like one of 1024 GPU cores consisting mostly of APUs, no fancy branch prediction silicon, supporting enough hardware threads to hide latency and keep those APUs saturated. (Yes the RISC-V vector extension has opcodes for gather/scatter in case you’re wondering).

If you can simplify the instruction decoding that’s always a benefit - moreso the more cores you have.

Then, last but not least: RISC-V absolutely deserves the name it has because the whole thing started out at Berkeley.

You’ll get no disagreement from me on that. Maybe you misunderstood what I meant by “CISC-V would be just as exciting”? I meant that if there was a popular, well designed, open source CISC architecture that was looking to be the eventual future of computing instead of RISC-V then that would be just as exciting as RISC-V is now.

__dev@lemmy.world · 5 months ago

The original debate from the 80s that defined what RISC and CISC mean has already been settled and neither of those categories really applies anymore. Today all high performance CPUs are superscalar, use microcode, reorder instructions, have variable width instructions, vector instructions, etc. These are exactly the bits of complexity RISC was supposed to avoid in order to achieve higher clock speeds and therefore better performance. The microcode used in modern CPUs is very RISC-like, and the instruction sets of ARM64/RISC-V and their extensions would have likely been called CISC in the 80s. All that to say the whole RISC vs CISC thing doesn’t really apply anymore and neither does it explain any differences between x86 and ARM. There are differences and they do matter, but by and large it’s not due to RISC vs CISC.

As for an example: if we compare the M1 and the 7840u (similar CPUs on a similar process node, one arm64 the other AMD64), the 7840u beats the M1 in performance per watt and outright performance. See https://www.cpu-monkey.com/en/compare_cpu-amd_ryzen_7_7840u-vs-apple_m1. Though the M1 has substantially better battery life than any 7840u laptop, which very clearly has nothing to do with performance per watt but rather design elements adjacent to the CPU.

In conclusion the major benefit of ARM and RISC-V really has very little to do with the ISA itself, but their more open nature allows manufacturers to build products that AMD and Intel can’t or don’t. CISC-V would be just as exciting.

__dev@lemmy.world · 7 months ago

Wrong. Unified memory (UMA) is not an Apple marketing term, it’s a description of a computer architecture that has been in use since at least the 1970’s. For example, game consoles have always used UMA.

Apologies, my google-fu seems to have failed me. Search results are filled almost entirely with Apple-related results, but I have now been able to find material from well before Apple’s use of the term, though nothing older than the 1990s.

While iGPUs have existed for PCs for a long time, they did not use a unified memory architecture.

Do you have an example? Every single one I look up has at least optional UMA support. Reserved RAM was a thing, but it wasn’t the entire memory of the GPU; it was only reserved for the framebuffer. AFAIK iGPUs have always shared memory like they do today.

It has everything to do with soldering the RAM. One of the reasons iGPUs sucked, other than not using UMA, is that GPU performance is almost always limited by memory bandwidth. Compared to VRAM, standard system RAM has much, much less bandwidth, causing iGPUs to be slow.

I don’t disagree, I think we were talking past each other here.

LPCAMM is a very recent innovation. Engineering samples weren’t available until late last year and the first products will only hit the market later this year. Maybe this will allow for Macs with user-upgradable RAM in the future.

Here’s a link to buy some from Dell: https://www.dell.com/en-us/shop/dell-camm-memory-upgrade-128-gb-ddr5-3600-mt-s-not-interchangeable-with-sodimm/apd/370-ahfr/memory. Here’s the laptop it ships in: https://www.dell.com/en-au/shop/workstations/precision-7670-workstation/spd/precision-16-7670-laptop. Available since late 2022.

What use is high bandwidth memory if it’s a discrete memory pool with only a super slow PCIe bus to access it?

Discrete VRAM is only really useful for gaming, where you can upload all the assets to VRAM in advance and data practically only flows from CPU to GPU and very little in the opposite direction. Games don’t matter to the majority of users. GPGPU is much more interesting to the general public.

gestures broadly at every current use of dedicated GPUs. Most of the newfangled AI stuff runs on Nvidia DGX servers, which use dedicated GPUs. Games are a big enough industry for dGPUs to exist in the first place.

__dev@lemmy.world · 7 months ago

“unified memory” is an Apple marketing term for what everyone’s been doing for well over a decade. Every single integrated GPU in existence shares memory between the CPU and GPU; that’s how they work. It has nothing to do with soldering the RAM.

You’re right about the bandwidth though: current socketed RAM standards have severe bandwidth limitations which directly limit the performance of integrated GPUs. This again has little to do with being socketed: LPCAMM supports up to 9.6GT/s, considerably faster than what ships with the latest Macs.

This is why user-replaceable RAM and discrete GPUs are going to die out. The overhead and latency of copying all that data back and forth over the relatively slow PCIe bus is just not worth it.

The only way discrete GPUs can possibly be outcompeted is if DDR starts competing with GDDR and/or HBM in terms of bandwidth, and there’s zero indication of that ever happening. Apple needs to put a whole 128GB of LPDDR in their system to be comparable (in bandwidth) to literally 10 year old dedicated GPUs: the 780ti had over 300GB/s of memory bandwidth with a measly 3GB of capacity. DDR is simply not a good choice for GPUs.
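To put rough numbers on the bandwidth comparison: theoretical peak bandwidth is just bus width times transfer rate. A minimal sketch, assuming a 384-bit bus at 7 GT/s for the 780ti and a 128-bit bus for an LPCAMM module at 9.6 GT/s. The bus widths are assumptions taken from public spec sheets rather than from this thread, and `peak_bandwidth_gbs` is a made-up helper name.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_gts: float) -> float:
    """Theoretical peak bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * transfer_rate_gts


# Assumed figures (not stated in the comment above):
gtx_780_ti = peak_bandwidth_gbs(384, 7.0)  # 384-bit GDDR5 at 7 GT/s
lpcamm = peak_bandwidth_gbs(128, 9.6)      # 128-bit module at 9.6 GT/s

print(f"780ti:  {gtx_780_ti:.1f} GB/s")
print(f"LPCAMM: {lpcamm:.1f} GB/s")
```

Under those assumptions a single fast socketed module still lands well below the decade-old card, which is why wide buses (and therefore lots of memory packages) are needed to catch up.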

__dev@lemmy.world · 7 months ago

That’s kinda true, in the sense that all batteries use a chemical reaction to generate electricity and a damaged battery can short and thus ignite arbitrarily. But there are lithium-based batteries like LiFePO₄ that burn significantly less intensely if at all, and there are lab-only chemistries that are non-flammable. So it’s not really because of the lithium specifically that they burn so well.

__dev@lemmy.world · 11 months ago

The only one I’ve seen is the VW “e-up!”.

__dev@lemmy.world · 11 months ago

There’s vulnerabilities like the recent iMessage exploit that are executed remotely through no interaction by the user. In combination with the ability to self-spread you get mass exploits like WannaCry which spread to 300k+ computers in 7 hours. All you need is a network connection.

__dev@lemmy.world · 1 year ago

So you push digital goods to a robust public platform like IPFS and tie decryption to a signed, non-revokable, rights token that you own on a block chain.

What you describe is fundamentally impossible. In order to decrypt something you need a decryption key. Put that on the blockchain and anyone can decrypt it.

Even if you can, pirates would only need to buy a single decryption key and suddenly your movie might as well be freely available to download. Pirates never pay hosting fees because it’s using the same infrastructure as customers and they can’t be taken down because they’re indistinguishable from customers.
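A toy sketch of the first problem, assuming the content key is stored in a publicly readable record. The cipher here is a deliberately simple stand-in (not secure, not any real DRM scheme), and `public_ledger` and `token-123` are hypothetical names for illustration only.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (illustration only): XOR data with a SHA-256-derived keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


movie = b"pretend this is a movie file"
key = b"content-key"

ciphertext = keystream_xor(key, movie)  # what gets pushed to public storage
public_ledger = {"token-123": key}      # hypothetical on-chain record, readable by everyone

# Anyone who can read the ledger, token owner or not, recovers the plaintext.
recovered = keystream_xor(public_ledger["token-123"], ciphertext)
print(recovered == movie)
```

Nothing in the decryption step checks who is asking; possession of the key is the only requirement, and a public ledger hands it to everyone.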

__dev@lemmy.world · 1 year ago

Adding blockchain into the mix changes nothing. Whether your digital ownership is stored in their centralized database or a distributed database, they still have control over everything because they’re the ones streaming it to you. They can just as well block your access & block resale.

The only way to actually digitally own something is to have a full DRM-free copy of it (ianal though this still might not be enough to allow resale).

__dev@lemmy.world · 1 year ago

Apple still uses Intel chips in all their Macs, just not for the CPU. The M1 MacBook, for instance, uses an Intel JHL8040R Thunderbolt 4 chip.

__dev@lemmy.world · 1 year ago

It’s a little complicated. A USB-3 connection must provide a higher current (900mA) than a USB-2 connection (500mA). As such, a USB-3 data connection can charge faster than a USB-2 connection, and some people may call this “fast charging”.

However, although USB-PD (Power Delivery, aka fast charging) was released as part of the USB 3.1 specification, it does not require a USB-3 data connection, and neither does a USB-3 data connection require USB-PD. You can see all the different USB-C modes on Wikipedia as well, where USB-2 and Power Delivery are listed separately: https://en.wikipedia.org/wiki/USB-C#USB-C_receptacle_pin_usage_in_different_modes
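For a sense of scale, here is the arithmetic behind those current limits as a small sketch. It assumes the nominal 5 V bus voltage for plain USB-2/USB-3 ports and uses 20 V at 5 A as an example Power Delivery profile; those voltages and the PD figure are assumptions added here, not numbers from the comment, and `power_watts` is a made-up helper.

```python
VBUS_VOLTS = 5.0  # nominal USB bus voltage without Power Delivery negotiation (assumed)


def power_watts(current_ma: int, volts: float = VBUS_VOLTS) -> float:
    """Power in watts for a given current in milliamps at the given voltage."""
    return volts * current_ma / 1000


usb2 = power_watts(500)                 # USB-2 data connection
usb3 = power_watts(900)                 # USB-3 data connection
pd_example = power_watts(5000, 20.0)    # assumed PD profile: 20 V at 5 A

print(usb2, usb3, pd_example)
```

The gap between the two data connections is small next to what Power Delivery negotiates, which is why the two things are worth keeping separate.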

__dev@lemmy.world · 1 year ago

If it was free to use then AMD would support it too

They do. There are Thunderbolt motherboards for AMD, and it’s coming with USB-4 on the new 7000-series mobile chips.

__dev@lemmy.world · 1 year ago

You’d need to collect the condensate, but that would actually work quite well.

__dev@lemmy.world · 1 year ago

I know you’re talking about the current AI hype cycle, but it was a buzzword in the 60s and again in the 80s.

__dev@lemmy.world · 1 year ago

I don’t see how that’s related at all. Having deterministic builds only matters if you’re building a binary from source; if you’re working with some distributed binary, you’ll be applying the patch to identical binaries anyway. And if a new binary is distributed, that’s going to be because something in the source was changed; deterministic builds will still give you a different binary if the source changes.

Binary patching is still common, both for getting around DRM and for software updates.
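As a rough illustration of why this works on a distributed binary: everyone has byte-identical input, so a patch can be pinned to a hash and a list of offsets. This is a minimal sketch with a hypothetical `apply_patch` helper and stand-in bytes, not how any particular patcher or updater is implemented.

```python
import hashlib


def apply_patch(binary: bytes, expected_sha256: str, patches: dict[int, bytes]) -> bytes:
    """Overwrite bytes at fixed offsets, but only if the input is the exact binary the patch targets."""
    if hashlib.sha256(binary).hexdigest() != expected_sha256:
        raise ValueError("binary does not match the version this patch was made for")
    patched = bytearray(binary)
    for offset, replacement in patches.items():
        patched[offset:offset + len(replacement)] = replacement
    return bytes(patched)


original = bytes.fromhex("554889e574059090c3")  # stand-in bytes, not a real program
target_hash = hashlib.sha256(original).hexdigest()

# Overwrite the two bytes at offset 4 with 0x90 0x90.
patched = apply_patch(original, target_hash, {4: b"\x90\x90"})
print(patched.hex())
```

When a new version ships, the hash check fails and the patch has to be redone against the new bytes, whether or not the build that produced them was deterministic.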

__dev@lemmy.world · 1 year ago

For a phone whose ethos is sustainability, buying a 2nd device just for music is antithetical. When my FP3 eventually goes out of support I’ll have to look elsewhere.

__dev@lemmy.world · 1 year ago

This isn’t the full picture of those statistics. 10097 games have Platinum or Gold on protondb, out of 11223 with any results at all. It’s not that there’s 60k games broken on Linux, it’s that there just isn’t any data on those.

The only correct thing to do here is to extrapolate from the data we have, which says around 90% of games work on Linux. So it’s more like 63 000 vs 70 000.
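The extrapolation spelled out as a quick sketch. It assumes the games with protondb reports are representative of the ones without, which is the big caveat here; the variable names are just for illustration.

```python
rated_ok = 10_097       # Platinum or Gold on protondb
with_reports = 11_223   # games with any results at all
total_games = 70_000    # rough catalogue size used in the comparison above

working_share = rated_ok / with_reports
estimated_working = round(working_share * total_games)

print(f"{working_share:.1%} of reported games work")
print(f"~{estimated_working:,} of {total_games:,} estimated to work")
```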

__dev@lemmy.world · 1 year ago

Completely agree with you. A book by default is not a reliable source, published or otherwise. It’s the scientific studies it quotes that are.

__dev@lemmy.world · 1 year ago

You say that like Apple would have to put in a ton of work for that. Android can already run on iPhones. It’s just an ARM computer. Project Sandcastle already exists. All they have to do is allow unlocking the bootloader, just like they do on Macs.