What are the IN & OUT instructions in x86 used for?

While reading Understanding the Linux Kernel I came across the IN and OUT instructions, so I looked them up in the reference manual:

5.1.9 I/O Instructions

These instructions move data between the processor's I/O ports and a register or memory.

IN    Read from a port
OUT   Write to a port
INS/INSB  Input string from port/Input byte string from port
INS/INSW  Input string from port/Input word string from port
INS/INSD  Input string from port/Input doubleword string from port
OUTS/OUTSB    Output string to port/Output byte string to port
OUTS/OUTSW    Output string to port/Output word string to port
OUTS/OUTSD    Output string to port/Output doubleword string to port

I still have quite a few questions:

  1. What are "the processor's I/O ports"? And why would we read or write "strings" from/to these ports?
  2. I have never run into a situation that needed these instructions. When would I need them?
  3. Give some practical examples.

You know how memory addressing works? There's an address bus, a data bus, and some control lines. The CPU puts the address of a byte (or a beginning byte) of memory on the address bus, then raises the READ signal, and some RAM chip hopefully returns the contents of memory at that address by raising or lowering individual lines (corresponding to bits in the byte(s)) on the data bus. This works for both RAM and ROM.

But then there are also I/O devices: Serial and parallel ports, the driver for a PC's tiny internal speaker, disk controllers, sound chips and so on. And those devices also get read from and written to. They also need to be addressed so the CPU accesses the correct device and (usually) the correct data location within a given device.

For some CPU models, including the x86 series found in most "modern" PCs, I/O devices sit on the same wires as memory: both RAM/ROM and I/O devices are connected to the same address, data and control lines, even though they occupy separate address spaces. For example, the serial port for COM1 is addressed starting at (hex) 03F8, but there is almost certainly also memory at that same numeric address.

Here's a really simple diagram:

[https://qph.ec.quoracdn.net/main-qimg-e510d81162f562d8f671d5900da84d68-c?convert_to_webp=true]

Clearly the CPU needs to talk to either memory or the I/O device, never both. To distinguish between the two, one of the control lines, called "M/#IO", indicates whether the CPU wants to talk to memory (line high) or an I/O device (line low).

The IN instruction reads from an I/O device, OUT writes to one. When you use the IN or OUT instructions, the M/#IO line is held low, so memory doesn't respond and the I/O chip does. For the memory-oriented instructions, M/#IO is held high, so the CPU talks to the RAM and the I/O devices stay out of the conversation.
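
As a hedged illustration (my own sketch, not part of the original answer), here is roughly what an I/O read and write look like in GNU assembler (AT&T) syntax, assuming the BIOS has already initialized the COM1 UART at port 0x3F8 mentioned above:

    /* Poll COM1's line status register until the transmitter is ready,
       then write one byte to the data register. Each IN/OUT here is a
       bus cycle with the M/#IO line signalling "I/O", not "memory". */
    mov  $0x3FD, %dx      /* 0x3FD: COM1 line status register */
    1:
    in   %dx, %al         /* I/O read cycle */
    test $0x20, %al       /* bit 5: transmitter holding register empty? */
    jz   1b
    mov  $0x3F8, %dx      /* 0x3F8: COM1 data (transmit) register */
    mov  $0x41, %al       /* ASCII 'A' */
    out  %al, %dx         /* I/O write cycle */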

Under certain conditions the IO devices can drive the data lines and the RAM can read them at the same time. And vice versa. It's called DMA.

Traditionally, serial and printer ports, as well as keyboard, mouse, temperature sensors and so forth were I/O devices. Disks were sort of in between; data transfers would be initiated by I/O commands but the disk controller would usually direct-deposit its data in system memory.

In modern operating systems like Windows or Linux, access to I/O ports is hidden away from "normal" user programs, and there are layers of software, privileged instructions and drivers to deal with the hardware. So in this century, most programmers don't deal with those instructions.

The CPU is connected to external controllers through I/O ports. On old x86 PCs I worked with the floppy drive through its I/O ports: if you know which commands the device controller accepts, you can program it through those ports.

In the modern world you will almost never use the port instructions yourself, unless you are (or will be) a driver developer.

There is more detailed information about I/O ports at http://webster.cs.ucr.edu/AoA/DOS/ch03/CH03-6.html#HEADING6-1

If you're not writing an operating system, then you will never use these instructions.

x86-based machines have two independent address spaces - the memory address space you're familiar with, and then the I/O address space. I/O port addresses are only 16 bits wide, and reference low-level registers and other low-level widgets that are part of an I/O device - something like a serial or parallel port, disk controller, etc.

There are no practical examples because these are only used by device drivers and operating systems.
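
For illustration only, here is a hedged sketch (my addition) of the two ways a port address can be given, in AT&T syntax; port 0x60 is the legacy keyboard controller data port and 0x3F8 is COM1:

    in   $0x60, %al       /* 8-bit immediate form: ports 0x00-0xFF only */
    mov  $0x3F8, %dx
    in   %dx, %al         /* DX form: reaches the full 16-bit port space */
    out  %al, %dx         /* writes use the same two forms */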

At the hardware level, most microprocessors have little or no I/O capability built in. A few processors have one or more pins that may be turned on and off using special instructions, and/or one or more pins that may be tested using special branch instructions, but such features are rare. Instead, I/O is usually handled by wiring the system so that accesses to a range of memory addresses will trigger some effect, or by including "in" and "out" instructions which behave like memory load/store operations except that a special signal is output saying "This is an I/O operation instead of a memory operation." In the days of 16-bit processors, there used to be some real advantages to having specialized in/out instructions. Nowadays such advantages are largely moot since one could simply allocate a big chunk of one's address space to I/O and still have plenty left for memory.

Since a program could wreak considerable havoc on a system by inappropriately performing I/O instructions (e.g. such instructions could perform arbitrary disk accesses), all modern operating systems forbid the use of such instructions in user-level code. Some systems may allow such instructions to be virtualized; if user code tries to write to I/O ports 0x3D4 and 0x3D5, for example, an operating system might interpret that as an attempt to set some video controller registers to move the blinking cursor. Each time the user program performed the OUT instruction, the operating system would take over, see what the user program was trying to do, and act appropriately.
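
For concreteness, here is a hedged sketch (my addition) of what such a user program might be doing with those two ports: 0x3D4 is the VGA CRTC index register, 0x3D5 its data register, and CRTC registers 0x0E/0x0F hold the text cursor position.

    /* Move the text-mode cursor to character cell 80 (row 1, column 0). */
    mov  $0x3D4, %dx
    mov  $0x0E, %al       /* CRTC index 0x0E: cursor location, high byte */
    out  %al, %dx
    mov  $0x3D5, %dx
    mov  $0x00, %al       /* high byte of 80 */
    out  %al, %dx
    mov  $0x3D4, %dx
    mov  $0x0F, %al       /* CRTC index 0x0F: cursor location, low byte */
    out  %al, %dx
    mov  $0x3D5, %dx
    mov  $80, %al         /* low byte of 80 */
    out  %al, %dx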

In the vast majority of cases, even if the operating system would translate an IN or OUT instruction into something suitable, it would be more efficient to request the appropriate action from the operating system directly.

Start with something like this:

http://www.cpu-world.com/info/Pinouts/8088.html

You are learning instructions from a very old chip/architecture, from back when everything but the processor core was off-chip. See the address lines and the data lines, plus the RD (read) line, the WR (write) line and the IO/M line?

There were two types of instructions, memory-based and I/O-based, because there were two addressable spaces, easily told apart by the IO/M (I/O or Memory) line.

Remember, you had 74LSxx glue logic, lots of wires and lots of chips just to connect memory to the processor. And memory was just that, memory: big, expensive chips. If you had a peripheral that needed to do anything useful, you also had control registers. The memory might hold pixel data, but somewhere you needed to set the horizontal and vertical scan clock limits, and those might be individual 74LSxx latches, NOT memories. Having I/O-mapped I/O saved on glue logic and just made a lot of sense from a programmer's perspective; it also avoided changing your segment registers to aim your 64K memory window around, etc. Memory address space was a sacred resource, especially when you wanted to limit your address decoding to a few bits, because every extra bit cost you a number of chips and wires.

Like big vs little endian, memory-mapped I/O vs I/O-mapped I/O was a religious war, and some of the responses you are going to see to your question will reflect the strong opinions that are still around today in the folks who lived it. The reality is that every chip on the market today has multiple busses for various things; you don't hang your real-time clock off of the DDR memory bus with an address decoder. Some still even have completely separate instruction and data busses. In a sense Intel won the war for the concept of separate address spaces for different classes of things, even though the term "I/O port" is evil and bad and should not be uttered for, say, 20-30 more years. You need folks my age who lived it to be retired or gone before the war is truly over. Even the term "memory-mapped I/O" is a thing of the past.

That is really all it ever was: a single address-decode bit on the outside of the Intel chip, controlled by the use of specific instructions. Use one set of instructions and the bit was on; use the other set and the bit was off. Want to see something interesting? Go look at the instruction set for the XMOS xCORE processors: they have lots of things that are instructions instead of memory-mapped registers, which takes this I/O-mapped-I/O idea to a whole new level.

Where it was used is as I described above: you would put things in memory space that made sense and that you could afford to burn address space on, like video pixels, network packet memory (maybe), sound card memory (well, not that either, but you could have), etc. The control registers, whose address space relative to the data was very small, maybe only a few registers, were decoded and used in I/O space. The obvious ones are/were serial ports and parallel ports, which had little if any storage; you might have had a small FIFO on the serial port, if anything.

Because address space was scarce, it was not uncommon, and is still seen today, to have memory hidden behind two registers: an address register and a data register. That memory is only available through those two registers; it is not memory mapped. You write the offset into this hidden memory into the address register, then read or write the data register to access its contents. Now, because Intel had the REP prefix and you could combine it with INSB/W and OUTSB/W, the hardware decoder would (if you had nice/friendly hardware folks working with you) auto-increment the address on every I/O cycle. So you could write the starting address into the address register, do a REP OUTSW, and without burning fetch-and-decode clock cycles in the processor and on the memory bus you could move data pretty fast into or out of the peripheral.

This kind of thing is now considered a design flaw thanks to modern superscalar processors with fetches based on branch prediction: your hardware can experience reads at any time that have nothing to do with executing code, so you should NEVER auto-increment an address, clear bits in a status register, or modify anything as a result of a read to an address. (Editor's note: actually you just make sure your I/O registers with read side-effects are in uncacheable memory regions/pages. Speculative prefetch of uncacheable memory isn't allowed in the x86 ISA, and can't ever happen for I/O-space accesses. But IN/OUT are very slow and partially serializing, and physical memory address space is no longer scarce, so device memory is normally just memory-mapped for efficient access with full-size PCIe transactions.)
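
A hedged sketch of that REP + string-I/O pattern, using the legacy ATA primary-channel data port 0x1F0 as the example device (this is my illustration and assumes the controller has already been given a read command and is signalling that data is ready):

    /* Pull one 512-byte sector (256 words) from the controller's buffer.
       Each iteration is one I/O read cycle; the destination pointer in
       (E)DI auto-increments on the CPU side, and the device supplies
       the next word on its side. */
    mov  $0x1F0, %dx      /* ATA data register */
    mov  $256, %ecx       /* 256 16-bit words = 512 bytes */
    cld                   /* count upward through the buffer at ES:(E)DI */
    rep  insw             /* repeated IN, storing to ES:(E)DI */

The matching REP OUTSW does the same in the other direction, reading from DS:(E)SI and writing to the port.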

The protection mechanisms built into the 386 and its successors actually make it quite easy to access I/O from user space. Depending on what you do for a living, what your company produces, etc., you can most definitely use the IN and OUT family of instructions from user space (application programs in Windows and Linux, etc.) or from kernel/driver space; it is your choice. You can also do fun things like take advantage of virtual machines and use I/O instructions to talk to drivers, but that would probably annoy folks in both the Windows and Linux worlds, and that driver/app wouldn't make it very far.

The other posters are correct that you are likely never going to need these instructions unless you are writing drivers, and you are likely never going to write drivers for devices using I/O-mapped I/O, because, you know... the drivers for those legacy devices have already been written. Modern designs most definitely have I/O, but it is all memory mapped (from a programmer's perspective) and uses memory instructions, not I/O instructions.

The other side of this is that DOS is definitely not dead; depending on where you work you may be building voting machines or gas pumps or cash registers or a long list of DOS-based equipment. In fact, if you work somewhere that builds PCs or PC-based peripherals or motherboards, DOS-based tools are still widely used for testing and for distributing BIOS updates and similar things. I still run into situations where I have to take code from a current DOS test program to write a Linux driver. Just like not everyone who can throw or catch a football plays in the NFL, percentage-wise very few do software work that involves this kind of stuff. So it is still safe to say that the instructions you found are likely not going to be more than a history lesson to you.
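
A hedged sketch of my own of what "using IN/OUT from user space" can look like on x86-64 Linux; it assumes root (or CAP_SYS_RAWIO) and uses the iopl system call (number 172 on x86-64) to raise the I/O privilege level before touching a port:

    .globl _start
    _start:
        mov  $172, %rax       /* __NR_iopl on x86-64 Linux */
        mov  $3, %rdi         /* request I/O privilege level 3 */
        syscall
        test %rax, %rax
        js   done             /* failed (not privileged?): skip the port access */

        mov  $0xAA, %al
        out  %al, $0x80       /* port 0x80: legacy POST/debug port, traditionally harmless */

    done:
        mov  $60, %rax        /* __NR_exit */
        xor  %rdi, %rdi
        syscall

Without the iopl (or ioperm) call, the OUT instruction faults with a general protection exception, which is exactly the "hidden away from normal user programs" behaviour described earlier in this thread.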

Give some practical examples.

First learn how to:

Then:

  1. PS/2 controller: read the scan code of the last key pressed into al:

    in $0x60, %al
    

    Minimal example

  2. Real Time Clock (RTC): read the wall-clock time (seconds, minutes, hours):

    .equ RTCaddress, 0x70
    .equ RTCdata, 0x71
    
    
    /* al contains seconds. */
    mov $0, %al
    out %al, $RTCaddress
    in $RTCdata, %al
    
    
    /* al contains minutes. */
    mov $0x02, %al
    out %al, $RTCaddress
    in $RTCdata, %al
    
    
    /* al contains hour. */
    mov $0x04, %al
    out %al, $RTCaddress
    in $RTCdata, %al
    

    Minimal example

  3. Programmable Interval Timer (PIT): generate interrupt number 8 once every 0x1234 / 1193181 seconds:

    mov $0b00110100, %al   /* channel 0, lobyte/hibyte access, mode 2 (rate generator) */
    outb %al, $0x43        /* PIT command port */
    mov $0x34, %al         /* divisor low byte */
    outb %al, $0x40        /* channel 0 data port */
    mov $0x12, %al         /* divisor high byte */
    outb %al, $0x40
    

    Minimal example

    A Linux kernel 4.2 usage. There are others.

Tested on: QEMU 2.0.0 Ubuntu 14.04, and real hardware Lenovo ThinkPad T400.

How to find port numbers: Is there a specification of x86 I/O port assignment?

https://github.com/torvalds/linux/blob/v4.2/arch/x86/kernel/setup.c#L646 has a list of many ports used by the Linux kernel.

Other architectures

Not all architectures have such IO dedicated instructions.

In ARM for example, IO is done simply by writing to magic hardware defined memory addresses.

I think this is what https://stackoverflow.com/a/3221839/895245 means by "memory mapped I/O vs I/O mapped I/O".

From a programmer point of view, I prefer the ARM way, since IO instructions already need magic addresses to operate, and we have huge unused address spaces in 64 bit addressing.

See https://stackoverflow.com/a/40063032/895245 for a concrete ARM example.

There is a bit more trickery to it than that. It doesn't just multiplex a separate 64 KB address space onto the same wires with an 'extra address bus/chip select pin'. The Intel 8086 and 8088 and their clones also multiplex the data bus and the address bus, all very uncommon stuff in CPUs. The datasheets are full of 'minimum/maximum' configuration stuff and all the latch registers you need to hook up to make it behave 'normally'. On the other hand, it saves a load of AND gates and OR gates in address decoding, and 64 KB should be 'enough I/O ports for everyone' :P.

Also, for all those 'driver developer only' people, take note: besides people using Intel-compatible chips in hardware other than PCs (they were never really intended for use in the IBM PC in the first place - IBM just took them because they were cheap and already on the market), Intel also sells microcontrollers with the same instruction set (Intel Quark), and there are plenty of 'systems on a chip' from other suppliers with the same instruction set as well. I don't think you'll manage to cram anything with a separate 'user space', 'kernel' and 'drivers' into 32 KB :). For most such things a complex 'operating system' is neither optimal nor desired. Forming some UDP packets in RAM, putting them into a ring buffer and making some relays go click-click does not require a 30 MB kernel and a 10-second load time, you know. It's basically the best choice when a PIC microcontroller isn't quite enough but you don't want a whole industrial PC. So the port I/O instructions do get used a lot, and not just by 'driver developers' for larger operating systems.

With names like 'I/O signal' and 'memory mapping', everything is made to sound far more complicated than it actually is, which gives people the impression that there is a lot more to it and that it is an advanced topic. The tendency now is to view it as something new, but that is very far from the case. Even Babbage in the 1830s drove his printer, which needed an I/O signal, albeit delivered by an axle and cogwheel. In the machines of Hero of Alexandria 2000 years ago, or in theatres right back to Greek times, they always pulled one rope from a set of different ropes to control the lights or the scenery. Each rope is like an input or output line; it's as simple as that. The address is 'which line', i.e. which thing (memory or device) we are choosing; the data is the information you are passing to, or reading back from, that memory or device.

Big mainframe computers that filled buildings with cabinets used things like 64-bit words right back in the 40s, and therefore dealt with I/O mapping just the same back then. Konrad Zuse's room-sized computer used floating point with about 20 decimal digits in the 1930s, and had to drive things like his printer, his various light-bulb indicators and his switches. On tiny microprocessors the story is different: they weren't envisioned until the 60s or built until 1971. All the techniques used with 8-bit logic in the 80s were used for 4-bit microprocessors in the 70s and 2-bit logic in the 60s, and were still used with 16-bit parts in the 90s, when everybody started to get a computer and, because it was now in front of them, began to discuss this I/O and memory-mapping topic for the first time, so it appeared to be something new that came with the advent of the internet. Then we had 32-bit in the 00s and 64-bit computers in the 10s, which caused endless discussion about memory and data lines. To answer your question I will be talking about chips that electronics hobbyists bought 30-40 years ago, as I did at the time, since later on things got so advanced that I was unable to build with the newer chips. But the principles are just the same now: the gates are just hidden inside bigger black-boxed chips that have extra pins so these operations can go on much more in parallel (e.g. enabling many octal latches, many chips enabled at once in rows), and the data and address buses have more lines. That's the only difference.

Well, I don't know anything about all the new languages or how it is on modern PCs now, but I can tell you how it was in the old days when I used to build computers with chips.

All I/O mapping and memory mapping means, in simple terms, is this: imagine you strung up a load of light bulbs for some celebration, had a wire going to each, and called the bulbs memory locations (i.e. the bulbs represent memory in the RAM, either on or off; if you select location 0 you get wire 0, location 1 wire 1, location 2 wire 2 and so on). If you then added some more wires, e.g. one wire going to a bell, that particular location is not a memory, it is a device that you output to, using the OUT command, to make it ring. But from the computer's viewpoint it is addressed just the same, because it comes in as a wire to the MPU like the others. If another wire were added for a switch that you operated externally, this would be an input device, read with an IN instruction. So this is called I/O-mapped I/O.

Now, on computers, the wires on the buses represent address lines or data lines, but they are in binary: with 2 wires you can have 00, 01, 10, 11, i.e. 4 combinations (2^2); with 8 lines, 2^8 = 256 possibilities; with 20 lines, 2^20 = 1048576; with 30 lines, 2^30 = 1073741824 (1 gig) of possibilities. This is why it's called MAPPED, rather than just saying I/O and memory: they say I/O-mapped and memory-mapped because you are mapping the wires AS A COMBINATION by binary-coding them. So if you had 2 wires (4 combinations), they can't just be connected to bulbs directly (not to mention the current amplification required for the tiny voltages from the MPU, and the prevention of feedback current); the 2 wires have to pass through a decoder (we used to use a 138 to decode 3 lines into 8, and a 154 to decode 4 binary lines into 16). Once through the decoder, these 2 lines, e.g. A0 and A1 (address line 0 and address line 1), become 4 lines (on or off) for the particular bulb you are driving (in the case of a computer, THE MEMORY). But in some cases those locations instead select some input/output device and say 'use me': just like memory, once the location is selected, data is passed one way or the other (using clever tri-state logic to cut off the voltages each time) on the data bus lines D0..7 or D0..31 or whatever data width your computer has (a 2-bit, 4-bit, 8-bit, 16-bit, 32-bit, 64-bit, 128-bit, 256-bit computer, whatever you are building).

So the data passes naturally in or out over the data lines to the memory or to the I/O device (IF it is memory mapped). But THIS SHOULD NOT BE CONFUSED WITH THE IN/OUT INSTRUCTIONS: IN and OUT refer to some OTHER I/O block, a special I/O address space inside the MPU assigned specially just for I/O, i.e. not memory mapped. You don't always get this I/O space on a microprocessor; e.g. I don't think we had it on a 6502, but we had it on a Z80. More artistic chips used just memory mapping, e.g. in games consoles etc.; more sensible but uninteresting (stayed-in-the-book) chips go for I/O space as well. Memory-mapped I/O is lightning fast, since it uses the same addressing as the RAM (which is super fast), hence graphics-type hardware uses memory mapping for its I/O to get the speed. I/O-mapped I/O is assigned to slow ports, e.g. the RS-232 serial port or the parallel port, and uses the IN and OUT commands.

Now if, instead of adding two wires, you actually took two of the wires that originally went to bulbs and replaced those bulbs with other things, e.g. a bell on one and a switch on another, these are no longer referenced (selected) with the IN and OUT instructions; they are referenced by accessing the particular memory location that selects those wires (which originally drove bulbs). So this is memory-mapped I/O.

Memory-mapped I/O means that the actual address bus that normally goes to the memory (the RAM) is also connected to OTHER decoders (logic decoders). When such a decoder senses a particular binary combination of address signals it produces a high output (e.g. if you had a load of AND and NOT gates and said 'if this and not that and so on', using pins A0..A20 or however wide your address bus is). That high signal ENABLES a latch for a particular device, like a serial port or a parallel port, and the latch then PASSES the data on the data bus through to the I/O device. That is for writing to the I/O device. Reading works the opposite way round: the I/O device passes the data back, and, if I remember rightly, the CPU puts the exact same address combination onto the address lines.

I presume it must work much the same way today, except that there will be far more data and address lines.

You literally are WIRING the I/O to the address lines. Hence the I/O is effectively MAPPED into the memory space, as though it were memory. But another latch stops the address pins from accessing the RAM at the same time, so that you don't get voltages from two address or data sources on the same line, which would damage the chips.

The IN and OUT instructions we had 40 years ago, on the Z80 chip. They are for the special case where the chip deals with I/O in a different way, i.e. it is not memory mapped (with memory-mapped I/O you just read or write a memory location, but with IN and OUT you are telling the CPU up front that this is an I/O access and not memory). So IN/OUT has its own I/O address space, extra to the RAM's address space. This 'I/O RAM', as it appears to be, has a set of addresses just the same, except that you access the device directly through a decoder attached to those I/O addresses rather than through the standard memory decoding; that is what the IN/OUT instructions are for.
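
To make the contrast concrete, here is a hedged sketch of my own (16-bit real-mode code assumed, e.g. assembled with GNU as under .code16): the first half is memory-mapped I/O, an ordinary MOV to the VGA text buffer at segment 0xB800; the second half is I/O-mapped I/O, reaching the VGA CRTC through ports 0x3D4/0x3D5.

    /* Memory-mapped I/O: a plain store puts a character on the screen. */
    mov  $0xB800, %ax
    mov  %ax, %es
    xor  %di, %di
    movb $0x58, %es:(%di)     /* 'X' into character cell 0 */
    inc  %di
    movb $0x07, %es:(%di)     /* attribute byte: light grey on black */

    /* I/O-mapped I/O: same bus, but IN/OUT make the CPU flag this
       cycle as an I/O access rather than a memory access. */
    mov  $0x3D4, %dx
    mov  $0x0A, %al           /* CRTC index 0x0A: cursor start register */
    out  %al, %dx
    mov  $0x3D5, %dx
    in   %dx, %al             /* read that register back through the data port */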

When you IN or OUT a STRING (I don't know x86 in detail, but presumably) it means you're sending or receiving data on the data bus (using all the data pins, D0..D15 or whatever width the bus is) MANY TIMES IN SERIES, at the maximum data rate possible for that particular I/O device (perhaps using some kind of handshaking signal; you'll have to look it up). So the data on the D0..63 lines (or D0..31 on older PCs, D0..15 on late-80s/early-90s PCs, or D0..7 on 80s and pre-80s PCs) goes across in SERIES, one transfer after another, instead of just once as with IN and OUT. I.e. INS and OUTS are just multiple INs and OUTs at some defined data rate. E.g. if you were moving network data, you'd want a lot of information in and out each time, so you'd transfer whole buffers of bytes, which in that case are best passed as strings of ASCII codes for letters and numbers. These commands behave exactly as if you used the IN and OUT instructions in a loop where the count is the string length.

If you are accessing, say, the PC speaker, you'd just be passing one piece of data at a time using OUT.
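
For instance, a hedged sketch of my own of driving the legacy PC speaker with a handful of OUTs (ports 0x43/0x42 program PIT channel 2, and port 0x61 gates its output to the speaker):

    mov  $0b10110110, %al   /* PIT channel 2, lobyte/hibyte, mode 3 (square wave) */
    out  %al, $0x43
    mov  $0xA9, %al         /* divisor 1193182 / 1000 ~= 0x04A9, i.e. roughly 1 kHz */
    out  %al, $0x42
    mov  $0x04, %al
    out  %al, $0x42
    in   $0x61, %al         /* read the current port-B value... */
    or   $0x03, %al         /* ...set the speaker-gate and speaker-data bits... */
    out  %al, $0x61         /* ...and the buzzer sounds until they are cleared */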

If you were reading from the parallel port, you'd use IN with the I/O address of the port; writing to it, e.g. to drive old printers or robotics with electronic signals, you'd use the OUT command. The parallel port and the serial port (old RS-232) are typical ports that were used this way. RS-232 carries serial data, only one bit in or out at a time, so if you were reading from an RS-232 line only one bit of the byte would be relevant, and the same goes for output. The baud rate is roughly 17 kHz at most for RS-232, but these ports used to drive a lot of electronics back in the day; I used to build RS-232 circuits, e.g. to read voltages or drive PIC microcontrollers. Each port is named, e.g. COM1, COM2, COM3, COM4, and they have I/O addresses. I don't have them to hand, but they are along the lines of 3F8h and 378h (h = hex address).

I am not sure about modern ports, but if you were writing to USB it is most likely memory-mapped I/O, for greater speed.

The PS/2 keyboard port, I think, uses the IN instruction to read data from the keyboard. It replaces the old RS-232 arrangement, but has a slightly different spec, I believe.

A disk drive was usually memory mapped, and presumably it still is now; i.e. you don't drive a disk with IN/OUT instructions, they'd be too slow. But ports are slow anyway, so it doesn't matter: a printer is slow in terms of the data rate required, compared to the, say, 200 megabytes/second required of a hard disk. A speaker only needs the frequency of the sound times about 10 or 20, so 20 kHz would be ample for a buzzer, hence it's port I/O. Slow things use port I/O, the IN/OUT instructions. Hence USB is probably now memory mapped; you'll have to check up on it.

A better way to understand it is this. On old computers back in the 80s, sometimes you wanted to control some device you had built and had no spec for the output ports (in those days manufacturers kept this hidden so that certain companies, e.g. joystick and cartridge companies, could get ahead in the market through some business deal). What you had to do was open the computer and literally solder wires to points on the address bus: e.g. you soldered three wires, at a safe distance so as not to damage the chip with the heat, to points wired by the circuit-board layout to, say, pins A15, A7 and A1 on the microprocessor. You would usually also wire up an MREQ line (memory request) and/or the RD/WR line to get a cleaner signal and add that into the AND/OR/NOT logic, although if you were clever you could do it with the address lines alone. Then you connected those three wires plus this extra ready-type signal (MREQ with RD or WR, giving an active low or high, possibly needing an extra NOT gate, to say the data is on the lines NOW) through a 4-input AND gate, whose output drove an LED through a 200-ohm resistor. Now you have your own memory-mapped, high-speed I/O to an LED, which you could latch through an SR latch or a D-type latch to store it in one bit of memory externally on some circuit board.

Here A15 is the 32768 line, A7 the 128 line and A1 the 2 line (binary works in powers of 2, so A1 is 2^1, A7 is 2^7 and A15 is 2^15). So if you addressed location 32768 + 128 + 2 = 32898 = 8082 in hex, using LDA or STA or LD in the assembler of an old MPU, you would output to this LED, and it would light up brightly if the resistor was smaller, say about 100 ohms. So you have done memory-mapped I/O, and as simple as it is, you could do the same today by soldering to your MPU's address lines; you just wouldn't, because of the delicacy of the circuits. You could also tap the data lines, D0..7 in the old days, or say D0..31 for 32-bit on an old 486 PC. Then if you addressed that location in machine code by loading the accumulator with the value 8 (mov ax, 8 nowadays) and storing the accumulator to that address (mov 8082h, ax), you would EVEN today get that LED to come on. Note that the 8 in this example is what goes on the data bus; in this particular case we are not passing data, we are just enabling the particular device (the LED is on if we have selected THAT I/O device, here just an LED). So it doesn't matter what number we load with that mov ax, 8 instruction; it could be mov ax, 243 and we'd still be enabling the LED on the 8082h line when we then do mov 8082h, ax, since we are using the same address.

You see, there are address lines and there are data lines. So when you address 3F8h for COM1, or whatever the address is, the I/O map is simply sending a signal out to a port, e.g. PS/2, and an AND gate is checking whether you have 1111111000 on the lines, i.e. 11 is 3, 1111 is F and 1000 is 8 (see binary-to-hex conversion). If high voltages appear in the bit positions where there is a 1, then the port, e.g. RS-232 or PS/2, is set active, i.e. it is enabled; this enables the latches via the CE (chip enable) signal, or CS (chip select). Simple.

On a latch this is the E (enable) pin, or OE (active-low output enable). With the example described above, we use the addresses to select (by decoding) WHICH I/O device we want to use (in the example, the LED comes on if that I/O device is selected); that is the enable line. THEN, once the I/O device is selected, data is passed from the data bus (D0..7 in the old days, or, say, D0..63 now for a 64-bit computer) via octal latches, 373s in the old days. These are D-type circuits which store the data inside the flip-flops: on the clock edge, the data passes through and is stored. That clock edge comes from the 'data ready' signal on the bus; it has various names and I don't know what the current name is. So for 64-bit we would have eight octal latches. They are bidirectional, or tri-state, so that when the I/O device is not being used the data lines sit in the high-impedance state.

So you select the I/O device with a combination on the address lines; that combination is the port number, e.g. 3F8h in OUT 3F8h, 7, and the data, here the 7 in the example, is what is passed on the data lines. In the OUT command the data passes OUT to the data latch and on to the I/O device. For IN you'd be doing something like IN 3F8h, 800h (I expect; I don't know the syntax of x86 assembler). What I mean is that for IN you are inputting data from the data lines (after having selected the address, here 3F8h, which selects THAT I/O device); this data comes from the I/O device, through the D-type flip-flops in the data latch (one for each bit of the data bus), and into the D0..7 (or D0..63 on modern PCs) pins on the MPU (micro-processing unit).

In that example I wrote IN 3F8h, 800h to suggest that once the data comes in it gets stored at address 800h. The real x86 syntax is different: you'd probably do IN 3F8h, ah or something similar, i.e. bring the data into a register first, and then MOV 800h, ah to move the data into the memory location in RAM (if you wanted to store it), or do something else with ah. ah is just an example register; it could be al, bh, bl or whatever, but check the syntax, every assembler is slightly different and I am not an expert on x86. Again, I am using 3F8h as an example I/O address; there are hundreds, probably thousands, of these addresses, e.g. 378h. See the I/O memory maps for the IBM PC for full lists.

Whereas when you access the memory (the RAM: e.g. 64-byte static RAMs and dynamic RAMs in the 70s, 8K SRAMs and DRAMs in the 80s, rows of SIMMs (single in-line memory modules) of a few megabytes each in the 90s, and now DDR DIMMs (dual in-line memory modules), the latest of which probably have a few gigabytes on each little chip), and the address isn't an I/O address (very few addresses are I/O addresses; on a modern PC memory is millions of times more likely to occupy the address space than I/O), you still use the same read/write data instructions, but you aren't driving external logic circuits looking for those bits; instead the address and data pins are wired, in effect, directly to the RAM chips.

In machine code, memory-mapped I/O and ordinary memory addressing look just the same, as though both were memory accesses, but what physically goes on in the actual electronic circuit is totally different.