Notes for vil/vil2

Good points about existing vil1

Problems with existing vil1

There are some fundamental divisions

data directly accessible v data downloadable
data type known at compile time v data type known at run time
data ordering known a priori v data ordering known after image loading

Do these not indicate that trying to have a single image type (or hierarchy) is incorrect? Does the single vil1_image hierarchy actually give you anything?

vil merges all the assumptions on the left in vil_image_view<T>, and those on the right in vil_image_resource

Background

Before ISBE, Manchester adopted VXL, we used an image type which provided

  1. Good plane support
  2. Arithmetic indexing, exposed in the interface
  3. A world to image transform
  4. A polymorphic class hierarchy whose base class could represent images of any dimension.

We used all of these features heavily, and none of them was available in vil1. To get round this problem we wrote our own public VXL-compatible image library - mul/mil. It provided all of the above, using vil1 to do the file loading and saving. We also designed the library in terms of views of images. We first informally raised the idea of rewriting vil1 to support our code at the Zurich VXL meeting in March 2001, with encouraging responses. After 18 months of noticing that we were duplicating more and more code that was already available in vil1, the short-notice arrangement of a meeting in Providence on the 27th Sep 2002 and the encouraging comments from other VXL members gave us the spur to design this proposal.

Philosophy

We want a single "normal" and "easy" image class that we can point new users at. This should also be the default class for doing actual pixel manipulation.

We want to keep the type proliferation down. In particular, we do not want to have to write or maintain vil_image_process_func() for more image types than absolutely necessary. So we enforce the following restriction in vil's API

By keeping this division clean we can avoid type proliferation. There are limited exceptional cases. One is when you absolutely must have a function that works on a view without knowing its pixel type - e.g. vil_load("filename") which returns a vil_image_view_base_sptr.

This division leads naturally to a division of labour.

A vil_image_view<T> is a view in the sense that it provides a view of some image data. You can make concrete copies of views cheaply, without needing to copy the underlying image data. As a service the view memory-manages the image data as well, so you don't need to worry about the underlying data at all.
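The cheap-copy behaviour can be illustrated with a minimal sketch. It does not use vil itself: shared_view_sketch is a hypothetical stand-in, and std::shared_ptr is assumed here to play the role of the view's memory management.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a view: a small header plus shared ownership
// of the pixel buffer (the real class is vil_image_view<T>).
struct shared_view_sketch {
  std::shared_ptr<std::vector<unsigned char>> chunk;  // reference-counted pixel storage
  unsigned ni, nj;

  shared_view_sketch(unsigned ni_, unsigned nj_)
    : chunk(std::make_shared<std::vector<unsigned char>>(std::size_t(ni_) * nj_)),
      ni(ni_), nj(nj_) {}

  unsigned char& operator()(unsigned i, unsigned j) {
    assert(i < ni && j < nj);
    return (*chunk)[std::size_t(j) * ni + i];
  }
};

// Copying a view copies only the header; both copies see the same pixels,
// and the buffer is released when the last view goes away.
inline bool copies_share_pixels() {
  shared_view_sketch a(4, 3);
  shared_view_sketch b = a;   // cheap: no pixel data is copied
  b(1, 2) = 42;               // write through one view...
  return a(1, 2) == 42        // ...and read it back through the other
      && a.chunk.use_count() == 2;
}
```

Neither view has to delete anything; the last one to be destroyed frees the pixels.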

vil_image_resource is the abstract base of a polymorphic class tree. The design is such that most work can be done through the base class API. You can cast it to its concrete type if you need to set some obscure property. To get a functional style, most operations on a vil_image_resource should be on a vil_image_resource_sptr.

We intend to uphold this API design rigorously. Whilst we have no intention of preventing users from doing whatever they like with their own code, we believe that the current design is locally optimal within a wide radius of convergence. We think that any VXL level-2 libraries should use these API guidelines as well.

Design decisions:

vil_image_view<T>::set_size(..) should be virtual and in the base class

It is in mil_image_2d. Resizing can be slow, so the cost of a virtual function call is no problem. It can be useful to set the size of an image without knowing its type. Decision IMS and TFC.

vil_memory_chunk should not be templated.

You could have a memory chunk containing rgb, and choose to view it as a vil_image_view<vil_rgb<vxl_byte> > with nplanes=1, or as a vil_image_view<char> with nplanes=3. It is hard to get the type of the smart pointers correct without linking the types T of the vil_memory_chunk<T> and vil_image_view<T>. We can put an image type in a member variable if it is necessary (e.g. for vsl IO). Decision IMS and TFC.
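A minimal sketch of the two interpretations of one block of memory follows. rgb_byte_sketch and byte_at are hypothetical names, not part of vil; the sketch assumes the rgb struct is packed into exactly three bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for vil_rgb<vxl_byte>: three packed bytes.
struct rgb_byte_sketch { unsigned char r, g, b; };
static_assert(sizeof(rgb_byte_sketch) == 3, "sketch assumes no padding");

// Read one component through a plain byte pointer plus (istep, jstep, planestep).
inline unsigned char byte_at(const unsigned char* top_left,
                             std::ptrdiff_t istep, std::ptrdiff_t jstep,
                             std::ptrdiff_t planestep,
                             unsigned i, unsigned j, unsigned p) {
  return top_left[std::ptrdiff_t(i) * istep + std::ptrdiff_t(j) * jstep
                  + std::ptrdiff_t(p) * planestep];
}

// The same memory seen as one plane of rgb pixels and as three planes of bytes.
inline bool same_memory_two_views() {
  const unsigned ni = 4, nj = 2;
  std::vector<rgb_byte_sketch> pixels(std::size_t(ni) * nj);  // "1 plane of rgb"
  pixels[1 * ni + 2] = rgb_byte_sketch{10, 20, 30};           // pixel (i=2, j=1)

  // "3 planes of bytes": istep = 3, jstep = 3 * ni, planestep = 1.
  const unsigned char* bytes =
      reinterpret_cast<const unsigned char*>(pixels.data());
  assert(bytes != nullptr);
  return byte_at(bytes, 3, 3 * ni, 1, 2, 1, 0) == 10
      && byte_at(bytes, 3, 3 * ni, 1, 2, 1, 1) == 20
      && byte_at(bytes, 3, 3 * ni, 1, 2, 1, 2) == 30;
}
```

Because the storage carries no pixel type of its own, neither interpretation is privileged.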

Use (i, j, p) default names for index parameters, and ni(), nj(), nplanes(), for sizes.

We want to avoid (x,y), so as not to give the impression that pixels are in a Cartesian coordinate system. It is good to link the index name and the size name to avoid for-loop mistakes. The alternative was row, col, plane, and rows, cols, planes. Inner loops should have short variable names. Decision at Providence meeting.

vil_image_view<T> should have a sparse interface.

Methods like set_to_window(..), flip_ud() and transpose() can all be provided as standalone functions without much additional overhead. Class APIs should be small rather than large. Use Doxygen's \relates option to help users find related functions. Decision IMS.

Using arithmetic indexing scheme

i.e.
vil_image_view<T>::operator()(i,j,p) {return *(top_left_ptr + i*istep + j*jstep + p*planestep);}

Don't use pointer indexing.
vil_image_view<T>::operator()(i,j) {return raster_ptrs[j][i];}
Multiplication is fast on modern processors. Pointers require an extra memory lookup, which is slower and can reduce the cache hit rate. Pointers do not generalise well to planes, or 3D. Decision confirmed at Providence meeting.
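As a minimal sketch of why arithmetic indexing generalises well, the following stand-in view (step_view_sketch and transpose_sketch are hypothetical names, not the vil classes) implements the indexing formula above and a transpose that only swaps sizes and steps, with no pixel copying and no table of raster pointers to rebuild.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical view using arithmetic indexing: pixel (i,j,p) lives at
// top_left + i*istep + j*jstep + p*planestep.
template <class T>
struct step_view_sketch {
  T* top_left;
  unsigned ni, nj, nplanes;
  std::ptrdiff_t istep, jstep, planestep;

  T& operator()(unsigned i, unsigned j, unsigned p = 0) const {
    assert(i < ni && j < nj && p < nplanes);
    return top_left[std::ptrdiff_t(i) * istep + std::ptrdiff_t(j) * jstep
                    + std::ptrdiff_t(p) * planestep];
  }
};

// Standalone transpose: swap the sizes and the steps, leave the data alone.
template <class T>
step_view_sketch<T> transpose_sketch(const step_view_sketch<T>& v) {
  return step_view_sketch<T>{v.top_left, v.nj, v.ni, v.nplanes,
                             v.jstep, v.istep, v.planestep};
}

inline bool transpose_needs_no_copy() {
  std::vector<int> buf = {0, 1, 2, 3, 4, 5};            // 3 wide, 2 high, row-major
  step_view_sketch<int> v{buf.data(), 3, 2, 1, 1, 3, 6};
  step_view_sketch<int> t = transpose_sketch(v);
  t(1, 2) = 50;                                         // same pixel as v(2, 1)
  return t.ni == 2 && t.nj == 3 && v(2, 1) == 50 && buf[5] == 50;
}
```

The same three steps cover interleaved planes, separated planes and flipped views; a raster-pointer table would need rebuilding for each of these.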

Best order for index parameters is i,j,plane.

There is no preferred indexing order with arithmetic indexing. You don't know whether the planestep is 1 or istep is 1. However, for ease of use it is worth having a consistent interface. Two alternatives are (plane, i, j) and (i, j, plane). The former is probably more natural. However, the latter allows us to use a default argument plane=0, which is very useful for keeping interface clutter down. Decision at Providence meeting.

Use vil_image_view_base_sptr in preference to vil_image_view_base *.

The main uses of this are in vil_load(..) and vil_image_resource::get_view(..). These functions cannot return a real vil_image_view<T> because that would involve knowing T in advance. They cannot return references because the real object would be destroyed as the function ends, before it is assigned to a variable in the calling level. We use the smart pointer rather than the raw pointer to avoid a likely source of memory leaks. However, most code in vil should operate on vil_image_view<T>. Decision at Providence meeting.

vil_image_view<T> should not be derived from vil_image_resource

As explained above these are actually two different types. Efficient access to ni, nj for vil_image_view<T> whilst matching interface for vil_image_resource would require a lot of complexity. Decision confirmed at Providence meeting.

There should be no general vil_image_resource::set_properties(..).

Its existence would imply the ability to set arbitrary properties. You can always have a specialist set_obscure_tiff_property(..) as a member function of vil_tiff_file_image. Decision by IMS and AWF. There will be a vil_image_resource::get_properties(..) with all properties documented in vil/vil_property.h. This allows for partially shared properties such as bits per component or physical pixel height.

vil_image_resource::get_view(..), etc. will not allow you to specify planes.

It is easy to split off individual planes later. Although plane specifications could give you a memory advantage, a factor of 3 is not big enough to be useful for very large images. Not having plane specifications can make programming vil_file_image_resource::get_view(..), etc. a lot easier and potentially faster. Decision at Providence meeting.

Index and size types should be unsigned int.

Signed has the advantage that it is much easier to pick up negative overflow errors. Unsigned has the advantage that it is more natural, makes the fixed 0,0 origin clear, and reduces the number of assertion checks needed (admittedly by exposing you to the negative overflow errors). We can catch some negative overflow errors with assert(i != (unsigned)(-1)). Decision at Providence meeting.
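A small sketch of the trade-off, using hypothetical helper names: with unsigned indices a single comparison covers both ends of the range, because an index that has gone "below zero" wraps to a very large value.

```cpp
#include <cassert>
#include <climits>

// With unsigned indices one comparison is enough: there is no i >= 0 test.
inline bool index_in_range(unsigned i, unsigned ni) { return i < ni; }

// Stepping back from index 0 wraps round to UINT_MAX rather than giving -1.
inline unsigned one_before(unsigned i) { return i - 1u; }

// Hypothetical checked accessor showing both asserts from the text.
inline unsigned checked_index(unsigned i, unsigned ni) {
  assert(i != (unsigned)(-1));   // catches the most common wrap-around
  assert(index_in_range(i, ni));
  return i;
}
```

For example, one_before(0u) equals UINT_MAX, which index_in_range rejects for any realistic image width.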

Use function overloading and normal filename prefix for functions on vil_image_view<T>, and functions on vil_image_resource objects.

Where function overloading doesn't discriminate (e.g. vil_load("filename")) then use vil_load_image_resource("filename") for vil_image_resource objects. Decision IMS and TFC.

vil_image_view<T>::set_size(ni, nj) will make no changes if size is same. It will detach from original underlying data and create new data if different.

This has some odd effects. If you have multiple views to the same data then set_size may or may not move the view to different data. However, it is very useful when treating views as ordinary images, and allows very efficient use of workspace images. For example, multiple calls to an image processing class often need identically sized image workspaces. Using this set_size means that the class doesn't have to reallocate memory on a regular basis. mil_image has been using this design for 18 months with no problems. Decision at Providence meeting.
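The behaviour described above can be sketched as follows. workspace_view_sketch is a hypothetical stand-in, with std::shared_ptr assumed to play the role of the shared memory chunk.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical view whose set_size() follows the rule in the text.
struct workspace_view_sketch {
  std::shared_ptr<std::vector<float>> chunk;  // shared pixel storage
  unsigned ni = 0, nj = 0;

  void set_size(unsigned new_ni, unsigned new_nj) {
    if (chunk && new_ni == ni && new_nj == nj) return;  // same size: keep the data
    // Different size: detach from the old data and allocate fresh storage.
    chunk = std::make_shared<std::vector<float>>(std::size_t(new_ni) * new_nj);
    ni = new_ni;
    nj = new_nj;
    assert(chunk->size() == std::size_t(ni) * nj);
  }
};

inline bool set_size_reuses_then_detaches() {
  workspace_view_sketch a;
  a.set_size(8, 8);
  workspace_view_sketch b = a;                 // second view of the same data
  const float* before = a.chunk->data();

  a.set_size(8, 8);                            // no change: no reallocation
  const bool reused = a.chunk->data() == before && a.chunk == b.chunk;

  a.set_size(16, 8);                           // a moves to new data, b is untouched
  const bool detached = a.chunk != b.chunk && b.ni == 8 && b.chunk->size() == 64;
  return reused && detached;
}
```

The second half of the example is the "odd effect": after the resize, a and b no longer refer to the same pixels.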

vil_image_view<T>::operator=() can be smart but should do fast view transforms only.

Allowing
vil_image_view<vil_rgb<vxl_byte> > im_rgb(ni, nj);
vil_image_view<vxl_byte> im_planes = im_rgb;

and
vil_image_view<vil_rgb<vxl_byte>> pnm = vil_load("filename.ppm");
makes for easy use. However, expensive operations should not have short names, so this should not include conversions such as byte to float. Such conversions should be done by separately named functions. Decision at Providence meeting.

vil_image_view<T>::operator=() should take base class and base class sptr.

Equals is defined as vil_image_view<T>::operator=(const vil_image_view_base &). This is an operator= from its own base class. This doesn't appear to cause any problems, and allows the above fast view transformations between pixel types. Additionally vil_image_view<T>::operator=(const vil_image_view_base_sptr &) allows for the simple conversion from the return value of many functions such as vil_load. These need to return base class sptrs to reduce memory leaks and because their interface cannot know the pixel type. Decision IMS.
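A minimal sketch of assignment from the base class and from a base class smart pointer is given below. All names are hypothetical stand-ins for the vil classes. The sketch only handles an exact pixel-type match, and it assumes that an incompatible assignment leaves the target view empty.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct view_base_sketch {                       // stand-in for vil_image_view_base
  unsigned ni = 0, nj = 0;
  virtual ~view_base_sketch() = default;
};
using base_sptr_sketch = std::shared_ptr<view_base_sketch>;

template <class T>
struct typed_view_sketch : view_base_sketch {   // stand-in for vil_image_view<T>
  std::shared_ptr<std::vector<T>> chunk;

  typed_view_sketch() = default;
  typed_view_sketch(unsigned ni_, unsigned nj_)
    : chunk(std::make_shared<std::vector<T>>(std::size_t(ni_) * nj_)) {
    ni = ni_;
    nj = nj_;
  }

  // Assignment from the base class: share the data if the types match,
  // otherwise become empty (an assumption of this sketch).
  typed_view_sketch& operator=(const view_base_sketch& rhs) {
    if (auto* same = dynamic_cast<const typed_view_sketch<T>*>(&rhs)) {
      ni = same->ni; nj = same->nj; chunk = same->chunk;
    } else {
      ni = nj = 0; chunk.reset();
    }
    return *this;
  }

  // Assignment from a base class smart pointer, e.g. a loader's return value.
  typed_view_sketch& operator=(const base_sptr_sketch& rhs) {
    if (rhs) return *this = *rhs;
    ni = nj = 0; chunk.reset();
    return *this;
  }
};

// Hypothetical loader: it cannot know the caller's T, so it returns the base sptr.
inline base_sptr_sketch load_bytes_sketch() {
  return std::make_shared<typed_view_sketch<unsigned char>>(5, 4);
}

inline bool assignment_checks_type() {
  base_sptr_sketch loaded = load_bytes_sketch();
  assert(loaded != nullptr);
  typed_view_sketch<unsigned char> ok;
  ok = loaded;                                  // compatible: shares the pixels
  typed_view_sketch<float> bad(2, 2);
  bad = loaded;                                 // incompatible: ends up empty
  return ok.ni == 5 && ok.nj == 4 && ok.chunk && bad.ni == 0 && !bad.chunk;
}
```

Neither assignment copies or converts pixel values, which keeps operator= cheap.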

vil_load should return a base class sptr.

(Similarly vil_image_resource::get_view(..).) They could return vil_image_view_base * or vil_image_view_base_sptr. Both appear to work. The smart pointer version has the advantage of not needing users to be careful about deletion. Decision by consensus on the reflector.

All pixel types should be explicit about type length (e.g. vxl_uint_16)

The alternative is to have vil_image_resource::get_view() automatically pick the best standard type for that platform (e.g. short). The problem with this becomes clear on 64-bit platforms: what types should 16-bit or 32-bit data get loaded into? (The same platform can't define short as both.) Decision at Providence meeting. Question - do the float types need to be specific or can we assume IEEE standard lengths?
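A short sketch of what explicit lengths give you in practice. It uses std::uint16_t from the standard library as an assumed equivalent of vxl_uint_16, and shows one possible compile-time answer to the float question; the helper names are hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <limits>

// Assumption: std::uint16_t plays the role of vxl_uint_16 in this sketch.
using pixel16_sketch = std::uint16_t;
static_assert(sizeof(pixel16_sketch) * CHAR_BIT == 16, "pixel type must be exactly 16 bits");

// One way to test whether float can be treated as a 32-bit IEEE type here.
inline bool float_is_ieee_single() {
  return std::numeric_limits<float>::is_iec559 && sizeof(float) * CHAR_BIT == 32;
}

// Buffer size for a 16-bit image; the same on every platform that compiles this.
inline std::size_t bytes_for_16bit_image(unsigned ni, unsigned nj) {
  assert(nj == 0 || ni <= SIZE_MAX / (sizeof(pixel16_sketch) * nj));  // no overflow
  return sizeof(pixel16_sketch) * ni * nj;
}
```

With a platform-chosen type such as short, the static_assert could fail on some platforms; with an explicit-length type it documents the guarantee.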

All operations will default to scalar pixels and multiple planes. vil_image_resource hierarchy will mostly work only with scalar pixels.

Nothing prevents you from using vil_rgb pixels, and we will provide appropriate support. However planes are more general than components, and vil_image_resource hierarchy can be simpler if it only thinks about scalar pixels. In particular it may not be feasible to consider every possible compound pixel type in a switch() statement, but it is feasible to consider every possible scalar pixel type. Decision confirmed at Providence meeting.
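The point about switch() statements can be sketched with a hypothetical format tag (the enumerator names below are invented for this example and are not vil's own): the list of scalar types is short and closed, whereas the list of possible compound pixel types is not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical run-time tag covering a few scalar pixel types.
enum class scalar_format_sketch { byte, uint_16, float_32, double_64 };

// A switch over scalar types is small enough to write and maintain by hand.
inline std::size_t bytes_per_scalar(scalar_format_sketch f) {
  switch (f) {
    case scalar_format_sketch::byte:      return sizeof(std::uint8_t);
    case scalar_format_sketch::uint_16:   return sizeof(std::uint16_t);
    case scalar_format_sketch::float_32:  return sizeof(float);
    case scalar_format_sketch::double_64: return sizeof(double);
  }
  assert(false && "unknown scalar format");
  return 0;
}

// Colour is expressed by the plane count, not by a compound pixel type.
inline std::size_t bytes_per_raster(scalar_format_sketch f, unsigned ni, unsigned nplanes) {
  return bytes_per_scalar(f) * ni * nplanes;
}
```

An rgb image is then just a scalar format with nplanes=3, so no extra case is needed.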

Base class image type polymorphic in dimension and pixel type, and containing world to image transforms will be provided in a separate level2 library - vimt.

vimt benefits from being able to use both vil and vgl (and possibly vnl). Decision TFC.

Deal with const pointer data issues pragmatically inside vil.

Pointers to data are often passed in as const, but may have the const removed. This is because it is rather complicated to keep track of whether the data is going to be changed or not. Just try to be sensible, OK? Decision TFC.

All standard image processing algorithms will go into vil/algo

We will dump the vipl interface. All image processing code will be written in terms of vil_image_view<T> first. A second function can then be written which uses the first function, but does all the stitching stuff. Decision at Providence meeting.

All vil/algo image processing functions will have their input images as the first parameters.

e.g. vil_sobel<T>(const vil_image_view<T>&source, vil_image_view<T> &dest_i, vil_image_view<T> &dest_j);

This is different from the operator=(..) style, but is more common in other function libraries. Decision at Providence meeting.

Best names are vil_image_view, vil_image_resource

A beginner will expect to see the word image in the main image type. However it aids understanding to be reminded that this is a view so vil_image_view. vil_image_resource was originally vil_image_data. However that was thought to be confusing and might get confused with the vil_memory_chunk class. So vil_image_resource. Decision at Providence meeting and by vote on vxl-maintainers.

Use set_size(..) rather than resize(..).

The STL has introduced the notion of a data preserving resize. Since our code doesn't do that, we use set_size(..) instead. Decision Amitha and consensus on the reflector.

raster and plane step types should be vcl_ptrdiff_t.

This is strictly the correct type to add to a pointer to get another pointer. It is not the same as signed int on platforms with a 64-bit address bus but 32-bit data. Decision IMS.
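Steps also need to be signed, because a flipped view walks backwards through memory. A minimal sketch with hypothetical names (raster_view_sketch, flip_lr_sketch):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical view of a single raster: pointer to the first pixel plus a signed step.
struct raster_view_sketch {
  const int* first;
  unsigned ni;
  std::ptrdiff_t istep;   // pointer-difference type, may be negative

  int operator()(unsigned i) const {
    assert(i < ni);
    return first[std::ptrdiff_t(i) * istep];
  }
};

// Left-right flip: start at the last pixel and negate the step.
inline raster_view_sketch flip_lr_sketch(const raster_view_sketch& v) {
  if (v.ni == 0) return v;
  return raster_view_sketch{v.first + std::ptrdiff_t(v.ni - 1) * v.istep, v.ni, -v.istep};
}

inline bool negative_step_flips() {
  const std::vector<int> row = {10, 20, 30, 40};
  raster_view_sketch v{row.data(), 4, 1};
  raster_view_sketch f = flip_lr_sketch(v);
  return f.istep == -1 && f(0) == 40 && f(3) == 10;
}
```

The index is converted to std::ptrdiff_t before the multiplication so that the product is computed in the pointer-difference type rather than in unsigned int.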

vcl_complex<float> is a scalar pixel type.

Whilst useful under certain circumstances, it is not sensible default behaviour to allow vil to treat a complex<float> as a two plane float image. Therefore vil_image_resource derivatives should treat complex<float> as one of their default pixel types. Decision - FW and IMS.

vil_image_view<T> should have a constructor which allows interleaved or separated plane storage.

This is the single most useful specialist memory format for a view, and so deserves to be easily created. Decision PVr and IMS.
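The two layouts differ only in their steps. The sketch below uses hypothetical helpers (make_steps, offset_of) rather than the actual constructor signature, which is not given in these notes.

```cpp
#include <cassert>
#include <cstddef>

struct steps_sketch { std::ptrdiff_t istep, jstep, planestep; };

// Steps for a freshly allocated ni x nj x nplanes image in either layout.
inline steps_sketch make_steps(unsigned ni, unsigned nj, unsigned nplanes, bool interleaved) {
  assert(nplanes > 0);
  const std::ptrdiff_t w = ni, h = nj, np = nplanes;
  if (interleaved) return steps_sketch{np, w * np, 1};  // RGBRGBRGB...
  return steps_sketch{1, w, w * h};                     // RRR...GGG...BBB...
}

// Offset of pixel (i, j, p) from the top-left pointer.
inline std::ptrdiff_t offset_of(const steps_sketch& s, unsigned i, unsigned j, unsigned p) {
  return std::ptrdiff_t(i) * s.istep + std::ptrdiff_t(j) * s.jstep
       + std::ptrdiff_t(p) * s.planestep;
}
```

For a 4 x 2 image with 3 planes, pixel (i=1, j=0, p=2) sits at offset 5 when interleaved and at offset 17 when the planes are stored separately.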

There should be versions of vil_convert_*() which take and return vil_image_view_base_sptrs.

There are already versions which take vil_image_view<T>s. However, when loading an image it is often useful to force it into a particular pixel-type, arrangement, etc. We cannot assign it to a concrete vil_image_view<T> because that will fail if the types are incompatible. If we want to allow vil_load() to remain easy to use then we cannot solve this problem by providing vil_convert(vil_image_resource_sptr). This is the 3rd place where we allow the use of vil_image_view_base_sptrs (after vil_load() and vil_image_resource::get_view()). Decision IMS, and TFC.

n.b. consistency suggests providing vil_convert(vil_image_resource_sptr) versions.

file resources should return pixel values in least significant bit position.

File types that have odd pixel widths (e.g. DICOM can use 12-bit images) should not scale those pixels to fill the full range of the C++ pixel-type that they return. So our 12-bit DICOM image may return vxl_uint_16 pixels with values from 0 to 4095. Decision IMS, TFC and Marc Laymon (GE).
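A small sketch of the difference between the two conventions. The function names are hypothetical, and the sketch assumes a 12-bit sample that has already been read into a 16-bit word.

```cpp
#include <cassert>
#include <cstdint>

// Chosen convention: keep the sample in the least significant bits,
// so a 12-bit image yields values from 0 to 4095.
inline std::uint16_t lsb_aligned_12bit(std::uint16_t stored) {
  return static_cast<std::uint16_t>(stored & 0x0FFF);
}

// Rejected convention, shown for contrast: stretch to fill the 16-bit range.
inline std::uint16_t scaled_to_16bit(std::uint16_t value12) {
  assert(value12 <= 4095);
  return static_cast<std::uint16_t>(value12 << 4);
}
```

With the first convention the pixel values seen by the program are the values recorded in the file; with the second, a full-scale sample of 4095 would be reported as 65520.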

 

Remaining questions

There is now no way to get a bit-compressed pbm file into memory efficiently. Is this a problem? We could solve it by deriving a vil_image_view_of_compressed_bits, or similar.

Possible improvements

Add a get_pixel(void *buf,i,j) and put_pixel(void *buf,i,j) to vil_image_resource.

These might be useful for efficient implementation of some algorithms. They could be implemented in terms of get_copy_view(..) and put_view(..) in the base class. However, in some cases you need to probe/set a pixel at a time and the overhead of allocating a vil_image_view<T> for each get_view(..) is quite expensive. Note that the cost of allocating the vil_memory_chunk is irrelevant. If the image is already in memory, no new vil_memory_chunk is allocated. If the image isn't already in memory, the cost of the new memory chunk is smaller than the disk access time. Note, one alternative may be to bring back the vil_image_resource::fill_view(...) which modifies the contents of an existing view and doesn't allocate any memory. IMS.

Add block processing implementation to the vil_image_resource derivatives for very large images.

The basic idea of dealing with large images was designed into vil1. It deferred loading the whole image into memory until actually needed. You could then load only one block of the image as required. This basic idea has been passed into vil2/vil. However, as with vil1, this is only a design outline and API. A lot of implementation to do block-by-block processing efficiently is missing.

The first step is to add preferred_block_size and preferred_block_origin to vil_property.h. Both of these unsigned[2] properties would be provided by file resources and used by the processing resources to do the work one block at a time. I guess many of the existing file resources will specify blocks that are one or more full rasters. Those file formats designed to deal with large images (including TIFF) can store their pixels in rectangular tiles. The processing resources should be able to use several of their supplier's blocks together up to some locally decided (or globally specified) limit. The processing resources would need to export these properties including any modification (e.g. vil_transpose().) 
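How a processing resource might walk an image one block at a time can be sketched as below. This is only an illustration under stated assumptions: block_sketch and split_into_blocks are hypothetical names, the preferred block origin is taken to be (0,0), and the block size is passed in directly rather than read from a property.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One rectangular region of the image: origin (i0, j0) and size ni x nj.
struct block_sketch { unsigned i0, j0, ni, nj; };

// Cover an ni x nj image with blocks of the preferred size; blocks on the
// right and bottom edges are clipped to the image.
inline std::vector<block_sketch> split_into_blocks(unsigned ni, unsigned nj,
                                                   unsigned block_ni, unsigned block_nj) {
  assert(block_ni > 0 && block_nj > 0);
  std::vector<block_sketch> blocks;
  for (unsigned j0 = 0; j0 < nj; j0 += block_nj)
    for (unsigned i0 = 0; i0 < ni; i0 += block_ni)
      blocks.push_back({i0, j0, std::min(block_ni, ni - i0), std::min(block_nj, nj - j0)});
  return blocks;
}
```

A file resource whose blocks are whole rasters corresponds to block_ni equal to the image width. A 10 x 5 image with a preferred block of 4 x 5 is covered by three blocks, the last of them only 2 pixels wide.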

Quite how vil_convolve_1d_resource should deal with a single column for a preferred block size (when vil_convolve_1d_resource operates on rows) - I don't know. It would take some experimentation to get an efficient solution. Until then we can always rely on the fact that the preferred_block_size, etc are just hints and can safely be ignored at the loss of efficiency. IMS.