| This article is part of the LWN Porting Drivers to 2.6 series. |
int remap_page_range(struct vm_area_struct *vma, unsigned long from,
unsigned long to, unsigned long size,
pgprot_t prot);
remap_page_range() is now explicitly documented as requiring that the memory management semaphore (usually current->mm->mmap_sem) be held when the function is called. Drivers will almost invariably call remap_page_range() from their mmap() method, where that semaphore is already held. So, in other words, driver writers do not normally need to worry about acquiring mmap_sem themselves. If you use remap_page_range() from somewhere other than your mmap() method, however, do be sure you have acquired the semaphore first.
Note that, if you are remapping into I/O space, you may want to use:
int io_remap_page_range(struct vm_area_struct *vma, unsigned long from,
unsigned long to, unsigned long size,
pgprot_t prot);
On all architectures other than SPARC, io_remap_page_range() is just another name for remap_page_range(). On SPARC systems, however, io_remap_page_range() uses the systems I/O mapping hardware to provide access to I/O memory.
remap_page_range() retains its longstanding limitation: it cannot be used to remap most system RAM. Thus, it works well for I/O memory areas, but not for internal buffers. For that case, it is necessary to define a nopage() method. (Yes, if you are curious, the "mark pages reserved" hack still works as a way of getting around this limitation, but its use is strongly discouraged).
The nopage() method made it through the entire 2.5 development series without changes, only to be modified in the 2.6.1 release. The prototype for that function used to be:
struct page *(*nopage)(struct vm_area_struct *area,
unsigned long address,
int unused);
As of 2.6.1, the unused argument is no longer unused, and the prototype has changed to:
struct page *(*nopage)(struct vm_area_struct *area,
unsigned long address,
int *type);
The type argument is now used to return the type of the page fault; VM_FAULT_MINOR would indicate a minor fault - one where the page was in memory, and all that was needed was a page table fixup. A return of VM_FAULT_MAJOR would, instead, indicate that the page had to be fetched from disk. Driver code using nopage() to implement a device mapping would probably return VM_FAULT_MINOR. In-tree code checks whether type is NULL before assigning the fault type; other users would be well advised to do the same.
There are a couple of other things worth mentioning. One is that the vm_operations_struct is rather smaller than it was in 2.4.0; the protect(), swapout(), sync(), unmap(), and wppage() methods have all gone away (they were actually deleted in 2.4.2). Device drivers made little use of these methods, and should not be affected by their removal.
There is also one new vm_operations_struct method:
int (*populate)(struct vm_area_struct *area, unsigned long address,
unsigned long len, pgprot_t prot, unsigned long pgoff,
int nonblock);
The populate() method was added in 2.5.46; its purpose is to "prefault" pages within a VMA. A device driver could certainly implement this method by simply invoking its nopage() method for each page within the given range, then using:
int install_page(struct mm_struct *mm, struct vm_area_struct *vma,
unsigned long addr, struct page *page,
pgprot_t prot);
to create the page table entries. In practice, however, there is no real advantage to doing things in this way. No driver in the mainline (2.5.67) kernel tree implements the populate() method.
Finally, one use of nopage() is to allow a user process to map a kernel buffer which was created with vmalloc(). In the past, a driver had to walk through the page tables to find a struct page corresponding to a vmalloc() address. As of 2.5.5 (and 2.4.19), however, all that is needed is a call to:
struct page *vmalloc_to_page(void *address);
This call is not a variant of vmalloc() - it allocates no memory. It simply returns a pointer to the struct page associated with an address obtained from vmalloc().
|
|
|
| Driver porting: supporting mmap() |
I test this on another machine and it succeeded. It seems the 2.6.0
kernrl didn't work normally on the original machine.
But I encounter another question now.
I reserve a 4M memory at boot time. In my driver I support mmap method
by calling remap_page_range() function with (TOTAL_MEMORY_SIZE - 4M) as
the physical memory start address parameter.
In the driver I access the memory block with the pointer returned by
ioremap(TOTAL_MEMORY_SIZE - 4M, 4M). It seems work. In App I write some
data into the pointer returned by mmap, and dump the memory block in
the driver, I find the output isn't the same as I wrote at App side.
What's wrong with the driver?
BTW, my driver supports read/write methods, too. In them I use
copy_to_user/copy_from_user. After write in App, the driver dumps the
memory block, the output is correct.
How can I support mmap correctly?