Yes, there are performance reasons. Some accesses effectively go through another layer of indirection to get the absolute position in memory.
There is also the GOT (Global Offset Table), which stores the addresses of global variables. To me, this just looks like an IAT fixup table, which is classified as position dependent by Wikipedia and a few other sources.
It adds an indirection. With position independent code you have to load the address of your function and then jump to it. Normally the address of the function is already present in the instruction stream.
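To make that extra step concrete, here is a minimal sketch in C that models it. This is not what a compiler actually emits; `add_one`, `slot_table` and `fake_loader_bind` are made-up names, with the table standing in for a slot that a loader would fill in at load time.

```c
#include <assert.h>

/* Hypothetical routine standing in for a function in a shared library. */
static int add_one(int x) { return x + 1; }

/* Non-PIC model: the call target is fixed in the instruction stream. */
static int call_direct(int x) { return add_one(x); }

/* PIC model: the target lives in a table slot, so every call first
   loads the address from the slot and only then jumps to it. */
typedef int (*fn_ptr)(int);
static fn_ptr slot_table[1];            /* assumed stand-in for a GOT/PLT slot */

/* Plays the role of the loader filling in the slot at load time. */
static void fake_loader_bind(void) { slot_table[0] = add_one; }

static int call_via_slot(int x) {
    assert(slot_table[0] != 0);         /* slot must be bound before use */
    return slot_table[0](x);            /* load the address, then call */
}
```

Both paths return the same result once `fake_loader_bind()` has run; the difference is the extra memory load on every call through the slot.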
Also, virtual memory hardware in most modern processors (used by most modern OSes) means that lots of code (all user space apps, barring quirky use of mmap or the like) doesn't need to be position independent. Every program gets its own address space which it thinks starts at zero.
In addition to the accepted answer: one thing that hurts PIC performance a lot is the lack of "IP relative addressing" for data on 32-bit x86. With "IP relative addressing" you could ask for data that is X bytes from the current instruction pointer. This would make PIC code a lot simpler.
Jumps and calls are usually EIP relative, so those don't really pose a problem. However, accessing data requires a little extra trickery. Sometimes a register is temporarily reserved as a "base pointer" to the data that the code requires. For example, a common technique is to abuse the way calls work on x86:
call label_1
dd 0xdeadbeef
dd 0xfeedf00d
dd 0x11223344
label_1:
pop ebp ; now ebp holds the address of the first data word
; this works because the call pushes the address of the
; **next** instruction, which here is the start of the data
; real code follows
mov eax, [ebp + 4] ; for example, accessing the '0xfeedf00d' word in a PIC way
This and other techniques add a layer of indirection to the data accesses. One example is the GOT (Global Offset Table) used by GCC.
x86-64 added a "RIP relative" mode which makes things a lot simpler.
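The idea behind relative addressing can be modelled in plain C. This is only an analogy, not real RIP-relative code, and `struct record` with its helpers is made up for illustration: a record that stores the absolute address of its data breaks when it is moved, while one that stores "X bytes from my own start" keeps working wherever it lands.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record that locates its payload in two ways. */
struct record {
    const uint32_t *abs_ptr;  /* absolute address: stale after a move */
    ptrdiff_t rel_off;        /* distance from the record's own start */
    uint32_t payload;
};

static void record_init(struct record *r, uint32_t value) {
    r->payload = value;
    r->abs_ptr = &r->payload;
    r->rel_off = (ptrdiff_t)offsetof(struct record, payload);
}

/* Position-dependent read: follows the stored absolute address. */
static uint32_t read_absolute(const struct record *r) {
    return *r->abs_ptr;
}

/* Position-independent read: own address plus a fixed offset. */
static uint32_t read_relative(const struct record *r) {
    uint32_t v;
    memcpy(&v, (const unsigned char *)r + r->rel_off, sizeof v);
    return v;
}
```

If you `memcpy` an initialised record to a new location and then overwrite the original's payload, `read_relative` on the copy still returns the copy's own value, while `read_absolute` on the copy follows the stale pointer back into the original.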
Because implementing completely position independent code adds a constraint to the code generator which can prevent the use of faster operations, or add extra steps to preserve that constraint.
This might be an acceptable trade-off to get multiprocessing without a virtual memory system, where you trust processes to not invade each other's memory and might need to load a particular application at any base address.
In many modern systems the performance trade-offs are different, and a relocating loader (whose cost is paid only when code is first loaded) is often less expensive than constraining the optimizer, which does its best work when given free rein. Also, the availability of virtual address spaces removes most of the motivation for position independence in the first place.
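As a rough sketch of what a relocating loader does (a simplified model, not any real loader or object file format), the image below is assumed to be an array of 32-bit words plus a list of indices of the words that hold absolute addresses. The function name and parameters are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Patch every word listed in relocs[] by the difference between the
   address the image was linked for and the address it was loaded at. */
static void apply_relocations(uint32_t *image, size_t image_words,
                              const size_t *relocs, size_t n_relocs,
                              uint32_t link_base, uint32_t load_base) {
    uint32_t delta = load_base - link_base;   /* wraps correctly if negative */
    for (size_t i = 0; i < n_relocs; i++) {
        assert(relocs[i] < image_words);      /* relocation must be in range */
        image[relocs[i]] += delta;            /* one-time fixup at load */
    }
}
```

The fixups are done once at load time; after that the code runs with plain absolute addresses and no per-access indirection, which is the trade-off described above.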
Nowadays many operating systems and compilers produce position-independent code by default. Try compiling without the -fPIC flag: the code will compile fine, and at most you will get a warning. OSes like Windows use a technique called memory mapping to achieve this.
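If you want to check what your own toolchain does by default, one option is to look at the predefined macros. GCC and Clang define `__PIC__` (and `__PIE__` for position-independent executables) when they generate such code; the helper names below are made up, and other compilers may not define these macros at all.

```c
/* Returns the PIC level the compiler reports: 0 if __PIC__ is not
   defined, otherwise its value (GCC uses 1 for -fpic, 2 for -fPIC). */
static int pic_level(void) {
#ifdef __PIC__
    return __PIC__;
#else
    return 0;
#endif
}

/* Same idea for position-independent executables (-fpie / -fPIE). */
static int pie_level(void) {
#ifdef __PIE__
    return __PIE__;
#else
    return 0;
#endif
}
```

Compile the same file with and without -fPIC (or with -fno-pic) and print the two values to see what your compiler picks when you pass no flag.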