One example of "immediate mode" is using glBegin and glEnd with glVertex in between them. Another example of "immediate mode" is to use glDrawArrays with a client vertex array (i.e. not a vertex buffer object).
You will rarely want to use immediate mode (except perhaps for a first "hello world" program), because it is deprecated functionality and does not offer optimal performance.
The reason immediate mode is not optimal is that the graphics card is tied directly to your program's flow. The driver cannot tell the GPU to start rendering before glEnd, because it does not know when you will be finished submitting data, and it needs to transfer that data too (which it can only do after glEnd).
Similarly, with a client vertex array, the driver can only pull a copy of your array the moment you call glDrawArrays, and it must block your application while doing so. The reason is that otherwise you could modify (or free) the array's memory before the driver has captured it. It cannot schedule that operation any earlier or later, because it only knows that the data is valid exactly at one point in time.
In contrast, if you use a vertex buffer object, for example, you fill a buffer with data and hand it to OpenGL. Your process no longer owns this data and can therefore no longer modify it. The driver can rely on this fact and can (even speculatively) upload the data whenever the bus is free.
Any of your later glDrawArrays or glDrawElements calls will just go into a work queue and return immediately (before actually finishing!), so your program keeps submitting commands while the driver works through the queue one command at a time. Those calls also likely won't need to wait for the data to arrive, because the driver could already have uploaded it much earlier.
Thus, the render thread and the GPU run asynchronously and every component is busy at all times, which yields better performance.
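To put rough numbers on why queuing helps, here is a toy timing model. It is only a sketch under simplifying assumptions, not a description of how any real driver schedules work: every draw call is assumed to cost the CPU a fixed `submit` time and the GPU a fixed `render` time, and both function names are made up for this illustration.

```cpp
#include <algorithm>

// Toy timing model (an assumption for illustration, not real driver behavior):
// each command costs the CPU `submit` time units to issue and the GPU
// `render` time units to execute.

// Blocking case: the CPU waits for each command to finish before issuing
// the next one, so the two costs simply add up for every command.
double total_time_blocking(int commands, double submit, double render) {
    return commands * (submit + render);
}

// Queued case: the CPU keeps issuing commands while the GPU drains the queue.
double total_time_queued(int commands, double submit, double render) {
    double cpu_clock = 0.0;  // time at which the CPU has issued the current command
    double gpu_clock = 0.0;  // time at which the GPU has finished the previous command
    for (int i = 0; i < commands; ++i) {
        cpu_clock += submit;
        // The GPU starts a command once it has been issued and the GPU is idle.
        gpu_clock = std::max(cpu_clock, gpu_clock) + render;
    }
    return gpu_clock;
}
```

With 4 commands, a submit cost of 1 and a render cost of 2, the blocking model takes 12 time units while the queued model takes 9, because submission overlaps with rendering.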
Immediate mode does have the advantage of being dead simple to use, but then again using OpenGL properly in a non-deprecated way is not precisely rocket science either -- it only takes very little extra work.
Here is the typical OpenGL "Hello World" code in immediate mode:
Edit:
By common request, the same thing in retained mode would look somewhat like this:
```cpp
float verts[] = {...};
float colors[] = {...};
static_assert(sizeof(verts) == sizeof(colors), "");

// not really needed for this example, but mandatory in core profile after GL 3.2
GLuint vao;
glGenVertexArrays(1, &vao);
glBindVertexArray(vao);

GLuint buf[2];
glGenBuffers(2, buf);

// assuming a layout(location = 0) for position and
// layout(location = 1) for color in the vertex shader

// vertex positions
glBindBuffer(GL_ARRAY_BUFFER, buf[0]);
glBufferData(GL_ARRAY_BUFFER, sizeof(verts), verts, GL_STATIC_DRAW);
glEnableVertexAttribArray(0);
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, nullptr);

// copy/paste for color... same code as above. A real, non-trivial program would
// normally use a single buffer for both -- usually with stride (5th param) to
// glVertexAttribPointer -- that presumes interleaving the verts and colors arrays.
// It's somewhat uglier but has better cache performance (ugly does however not
// matter for a real program, since data is loaded from a modelling-tool generated
// binary file anyway).
glBindBuffer(GL_ARRAY_BUFFER, buf[1]);
glBufferData(GL_ARRAY_BUFFER, sizeof(colors), colors, GL_STATIC_DRAW);
glEnableVertexAttribArray(1);
glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, 0, nullptr);

glDrawArrays(GL_TRIANGLES, 0, 3);
```
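The single-buffer variant mentioned in the comment above could be prepared roughly as follows. This is only a sketch: the `Vertex` struct, the `interleave` helper and the `k...` constants are names invented for this illustration, and the layout assumes three floats each for position and color.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical vertex layout: position followed by color, three floats each.
struct Vertex {
    float position[3];
    float color[3];
};

// Values that would be passed to glVertexAttribPointer as the stride
// (5th parameter) and as the byte offset of each attribute (6th parameter).
constexpr std::size_t kStride         = sizeof(Vertex);
constexpr std::size_t kPositionOffset = offsetof(Vertex, position);
constexpr std::size_t kColorOffset    = offsetof(Vertex, color);

// Merge separate position and color arrays (three floats per vertex each)
// into one interleaved array suitable for a single buffer upload.
std::vector<Vertex> interleave(const float* verts, const float* colors,
                               std::size_t vertex_count) {
    std::vector<Vertex> out(vertex_count);
    for (std::size_t i = 0; i < vertex_count; ++i) {
        for (std::size_t c = 0; c < 3; ++c) {
            out[i].position[c] = verts[3 * i + c];
            out[i].color[c]    = colors[3 * i + c];
        }
    }
    return out;
}
```

The result would then be uploaded once with glBufferData (using `out.size() * sizeof(Vertex)` as the size and `out.data()` as the pointer), and both glVertexAttribPointer calls would pass `kStride` as the stride, with `kPositionOffset` and `kColorOffset` cast to pointers for the last argument.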
The vertex and fragment shader programs are represented as C-style strings containing GLSL code (vertexShaderSource and fragmentShaderSource) inside a regular C program that runs on the CPU.
This C program makes OpenGL calls (such as glShaderSource and glCompileShader) that compile those strings into GPU code.
The shaders define their expected inputs, and the C program provides them to the GPU code through a pointer to memory. For example, the vertex shader defines its expected inputs as an array of vertex positions and colors:

```cpp
"layout (location = 0) in vec3 position;\n"
"layout (location = 1) in vec3 color;\n"
"out vec3 ourColor;\n"
```

It also defines one of its outputs, ourColor, as an array of colors, which then becomes an input to the fragment shader.
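To make the hand-off between the two stages concrete, here is a sketch of what the two shader strings might look like in full. The shader bodies, the `fragColor` output name and the `links_by_name` helper are assumptions made for this illustration; only the three declaration lines quoted above come from the original example.

```cpp
#include <string>

// Assumed vertex shader: forwards the per-vertex color to the next stage.
const char* vertexShaderSource =
    "#version 330 core\n"
    "layout (location = 0) in vec3 position;\n"
    "layout (location = 1) in vec3 color;\n"
    "out vec3 ourColor;\n"
    "void main() {\n"
    "    gl_Position = vec4(position, 1.0);\n"
    "    ourColor = color;\n"
    "}\n";

// Assumed fragment shader: receives the interpolated color under the same name.
const char* fragmentShaderSource =
    "#version 330 core\n"
    "in vec3 ourColor;\n"
    "out vec4 fragColor;\n"
    "void main() {\n"
    "    fragColor = vec4(ourColor, 1.0);\n"
    "}\n";

// Hypothetical helper: true if `name` is declared as a vec3 output in the
// producer string and as a vec3 input in the consumer string.
bool links_by_name(const std::string& producer, const std::string& consumer,
                   const std::string& name) {
    return producer.find("out vec3 " + name + ";") != std::string::npos &&
           consumer.find("in vec3 " + name + ";") != std::string::npos;
}
```

The point of the helper is only to show that the two stages are connected by a matching name and type (`ourColor`, `vec3`); in a real program that matching is done by the GL implementation when the program is linked.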
By comparison, the immediate-mode API is a much more restricted model: the positions and colors are no longer arbitrary user-defined arrays in memory, but just inputs to a Phong-like model.
In both cases, the rendered output normally goes straight to the display without passing back through the CPU, although it is possible to read it back to the CPU, e.g. if you want to save it to a file: How to use GLUT/OpenGL to render to a file?
Most "modern" OpenGL tutorials normally use retained mode and GLFW; you will find many examples at: