Structure From Motion


I have just discovered your site today and it appears to be a very exciting project.

I am doing a master's with my thesis being:
3D reconstruction in the ocean environment, combining photography with subsea positioning sensors.

This is a project that will produce 3D data from photographs using photogrammetry. It goes beyond the standard SfM approach in that full quality and error estimates will be generated from the data. This is required if it is to be used for survey-quality mapping projects.

I have two questions:

  1. Are you using any positioning sensors?
    Gyro, motion sensors, and/or acoustic positioning from the surface?

  2. Are any members willing to share imagery with me? I will process it and share the results on this forum.

Best regards,




Maybe give a little more info on yourself and what you are looking for, and I can flick some data through.

We have had a number of people (myself included) who have played with SfM have a look through this, as a quick brush-up on where we are up to.

Realistically, with a baseline you can get pretty good results. Have a look at the image where we had a good tie back to the tape and the measuring sticks in the frame.
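For reference, scaling an otherwise dimensionless SfM model off a known object like the measuring sticks is straightforward once the model is built. A minimal sketch (with made-up point coordinates) of recovering the model-to-metres scale factor from a known reference length:

```python
import numpy as np

def scale_from_reference(p1, p2, known_length_m):
    # Scale factor mapping model units to metres, from two model-space
    # points whose real separation is known (e.g. the ends of one of
    # the measuring sticks in the image).
    model_length = np.linalg.norm(np.asarray(p2, float) - np.asarray(p1, float))
    return known_length_m / model_length

# Made-up values: stick ends reconstructed 2.5 model units apart,
# real stick 1.0 m long, so each model unit is 0.4 m.
s = scale_from_reference([0.0, 0.0, 0.0], [2.5, 0.0, 0.0], 1.0)
```

Multiplying every model coordinate by this factor gives measurements in metres, assuming the reconstruction has no significant scale drift across the scene.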


Thanks for your reply and the document on SfM. Much appreciated.

  My background is that I am a surveyor who has worked in the
    offshore oil and gas industry for a number of years. I am also a
    computer programmer who has worked on many survey-related projects.
  I am currently completing a Masters at Curtin University in
    the Spatial Sciences Department.
  I have attached my candidacy paper for this research project.
    [Your mail server wouldn't accept my document. I can send it to an
    alternate address if you provide one.]

  The project is a software project that, although it can produce
    models similar to Agisoft and VisualSFM, takes a more theoretical
    approach, with error analysis and full photogrammetric techniques
    being a major component. Basically I have to demonstrate a full
    mathematical understanding of the processes involved.

  - Solution of a more extensive set of camera parameters than just
    the 3 radial distortion parameters that VisualSFM and Agisoft use.
  - Full error analysis, enabling outliers to be eliminated from the
    data for better solution accuracy.
  - The resulting model will be on a proper mapping grid rather
    than the arbitrary coordinate system that VisualSFM produces.
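For context on the first point: a fuller photogrammetric calibration typically uses a Brown-Conrady style model with tangential (decentring) terms in addition to the radial ones. A minimal sketch of applying such a model to normalised image coordinates (coefficient values here are illustrative only):

```python
import numpy as np

def distort(xy, k=(0.0, 0.0, 0.0), p=(0.0, 0.0)):
    # Brown-Conrady model on normalised image coordinates: three radial
    # coefficients k1..k3 (the set tools like VisualSFM/Agisoft solve)
    # plus the tangential (decentring) terms p1, p2 that a fuller
    # photogrammetric calibration would also estimate.
    x, y = xy
    r2 = x * x + y * y
    radial = 1.0 + k[0] * r2 + k[1] * r2 ** 2 + k[2] * r2 ** 3
    xd = x * radial + 2.0 * p[0] * x * y + p[1] * (r2 + 2.0 * x * x)
    yd = y * radial + p[0] * (r2 + 2.0 * y * y) + 2.0 * p[1] * x * y
    return np.array([xd, yd])
```

With all coefficients zero the mapping is the identity; a rigorous calibration would estimate these coefficients (and their covariances) inside the bundle adjustment rather than in a separate step.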
  The end result will be a much more flexible product that can
    be used on various mapping projects where understanding the
    accuracy of the results is required, for instance if fabrication
    work is based on the results. I am happy to share my findings and
    software once complete.
  My main aim is to get some data from you from the GoPro,
    especially if there is associated spatial information:
  - Coordinate information for each camera position
  - Information within the image of features of known dimensions.
  Thanks for any help you may be able to provide.

  Alan Buchanan


Hi Alan,
Your goal of 3D scene reconstruction seems very promising. I’m part of the SLAM research group at the Robotics Research Centre, IIIT-Hyderabad. I’m new to underwater SfM, though I’ve done some work on Visual Odometry (VO) and semi-dense scene reconstruction on the KITTI dataset (mostly based on minimizing 3D-2D reprojection errors). I’m not sure how well the standard techniques would work underwater. Is there some good literature on this?
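For readers unfamiliar with the 3D-2D objective mentioned above, a minimal sketch of the per-observation reprojection residual (the quantity PnP and bundle adjustment minimise), with illustrative intrinsics:

```python
import numpy as np

def reprojection_error(K, R, t, X, uv):
    # Pixel residual for one 3D point X observed at pixel uv, given
    # intrinsics K and camera pose (R, t). Summing squared residuals
    # over all observations gives the 3D-2D objective minimised in
    # PnP and bundle adjustment.
    x_cam = R @ X + t              # world -> camera frame
    x_img = K @ x_cam              # camera frame -> homogeneous pixels
    proj = x_img[:2] / x_img[2]    # perspective divide
    return proj - uv

# Illustrative intrinsics: 100 px focal length, principal point (50, 50).
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
err = reprojection_error(K, np.eye(3), np.zeros(3),
                         np.array([0.0, 0.0, 2.0]), np.array([50.0, 50.0]))
```

A point on the optical axis projects to the principal point, so the residual above is zero; real optimisation stacks these residuals over all point/camera pairs and solves by nonlinear least squares.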

I’d be very interested to see if we can make use of stereo setups for underwater VO. There are some open-source solutions like libviso2, which I have been trying to improve using a version of a Rao-Blackwellised particle filter with a visual odometric motion model. If the standard model holds underwater with some corrections, this could pave the way for improved navigation and drift correction on the OpenROV itself.
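As an illustration of the idea (not libviso2's actual implementation, and simplified to a plain rather than Rao-Blackwellised filter), here is a toy 2D particle filter that uses VO displacements as the motion model and a range to a known acoustic beacon as the measurement; the noise values and beacon position are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

def pf_step(particles, vo_delta, beacon, range_meas,
            motion_sigma=0.05, range_sigma=0.5):
    # Predict: apply the VO displacement with additive noise standing in
    # for VO drift (this noise would need characterising underwater).
    particles = particles + vo_delta + rng.normal(0.0, motion_sigma, particles.shape)
    # Update: weight each particle by the likelihood of the measured
    # range to the (assumed known) beacon position.
    pred_range = np.linalg.norm(particles - beacon, axis=1)
    w = np.exp(-0.5 * ((pred_range - range_meas) / range_sigma) ** 2)
    w /= w.sum()
    # Resample (simple multinomial; systematic resampling is the usual
    # lower-variance choice in practice).
    return particles[rng.choice(len(particles), size=len(particles), p=w)]

# Toy run: vehicle moves +1 m per step in x, beacon sits at (10, 0).
beacon = np.array([10.0, 0.0])
particles = rng.normal([0.0, 0.0], 0.2, (500, 2))
true_pos = np.array([0.0, 0.0])
for _ in range(5):
    true_pos = true_pos + np.array([1.0, 0.0])
    particles = pf_step(particles, np.array([1.0, 0.0]), beacon,
                        np.linalg.norm(true_pos - beacon))
```

The particle mean tracks the true position; the Rao-Blackwellised version would additionally marginalise landmark or map states analytically per particle.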

I also came across Prof. Blair Thornton’s work ( ) where they used a mapping device consisting of a camera and a sheet laser. The idea was to have the sheet laser in the camera’s FOV and extract the 3D points corresponding to the laser line to build a bathymetric model. Colour information for the 3D points was then recovered on a per-pixel basis from the camera by accounting for the motion, which was recovered from the navigation information.
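The geometry behind the sheet-laser approach is a ray-plane intersection: each laser-line pixel back-projects to a ray that is cut by the calibrated laser plane. A minimal sketch in camera coordinates (assuming the plane parameters n, d come from calibrating the camera-laser rig):

```python
import numpy as np

def laser_point(pixel, K, plane_n, plane_d):
    # Back-project the laser-line pixel to a ray X = s * ray in camera
    # coordinates, then cut it with the laser sheet, modelled as the
    # plane n . X = d (n and d assumed known from rig calibration).
    u, v = pixel
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    s = plane_d / (plane_n @ ray)   # solve n . (s * ray) = d for s
    return s * ray

# Toy check: identity intrinsics, laser plane z = 1, centre pixel.
p = laser_point((0.0, 0.0), np.eye(3), np.array([0.0, 0.0, 1.0]), 1.0)
```

Transforming each such point by the navigation-derived vehicle pose at the exposure time is what stitches the per-frame laser profiles into a bathymetric model.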

If there are any datasets available, I’d love to look into this myself.



Thanks Akshay,

I will find some information for you.

Underwater photogrammetry does work, but the challenge is to extend it to
include other sensors, to develop a combined solution that generates
real-world coordinates.

More than happy to share what I am doing.



Thanks Alan. Very happy that you’ve taken this up for your master’s thesis and are open to sharing results.

For sensor fusion you could be looking at an EKF-based approach. Has this been done in the past with VO/IMU? I have tried DVL+IMU in the past. Maybe use visual odometry as a continuous source of odometric information; we may need to characterise its noise in an underwater setting. Then try to fuse rotations from an IMU (VO doesn’t typically give good rotations, in my experience). Acoustic bearing and elevation (if available) can be fused as you would discrete GPS measurements. We may have to look into the kind of motion model and linearisations involved for the filter.
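As a concrete illustration of the fusion idea (position-only state, so the EKF reduces to a linear KF; a real filter would carry full pose and velocity), here is a toy predict/update pair using VO as the relative-motion input and a discrete absolute fix, handled the way a sparse GPS measurement would be. All numbers are made up:

```python
import numpy as np

def ekf_predict(x, P, vo_delta, Q):
    # VO supplies a relative displacement. With a position-only state
    # the model is linear; a full pose+velocity state would need the
    # motion-model Jacobians here.
    return x + vo_delta, P + Q

def ekf_update(x, P, z, R):
    # Fuse a discrete absolute position fix (e.g. acoustic USBL),
    # treated like a sparse GPS fix. The measurement model is H = I.
    S = P + R
    Kg = P @ np.linalg.inv(S)                  # Kalman gain
    x_new = x + Kg @ (z - x)
    P_new = (np.eye(len(x)) - Kg) @ P
    return x_new, P_new

# Toy numbers: one VO step of 1 m in x, then one noisy absolute fix.
x, P = np.zeros(2), np.eye(2)
x, P = ekf_predict(x, P, np.array([1.0, 0.0]), 0.01 * np.eye(2))
x, P = ekf_update(x, P, np.array([1.2, 0.0]), 1.01 * np.eye(2))
```

With equal prediction and measurement uncertainty the estimate lands halfway between the dead-reckoned and measured positions, which is the intuition behind weighting the two sources by their covariances.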



DVL+IMU is used extensively in commercial INS systems.

Off the top of my head, I think it could be worthwhile looking outside
the box on this.

I think you could use photography to do a lot of this work.

  1. If the vessel is visible from the ROV, then it is feasible to track
    the vessel with an upward-looking camera.

  2. A DVL could be emulated from overlapping forward cameras (time-tagged
    overlapping photographs).

  3. If there is a cheap acoustic range system, this could be included in
    the adjustment.
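Point 2 can be sketched concretely: with a roughly nadir camera and a known altitude above a flat bottom, the median pixel displacement of matched features between two time-tagged frames converts to a ground-frame velocity. A toy version (all numbers illustrative, flat-floor assumption):

```python
import numpy as np

def velocity_from_frames(matches_t0, matches_t1, altitude_m, focal_px, dt):
    # Emulated DVL: median pixel displacement of matched seafloor
    # features between two time-tagged frames, scaled to metres via the
    # ground sample distance (flat bottom and nadir camera assumed) and
    # divided by the frame interval.
    flow_px = np.median(matches_t1 - matches_t0, axis=0)  # median = cheap outlier rejection
    metres_per_px = altitude_m / focal_px                 # ground sample distance
    return flow_px * metres_per_px / dt                   # m/s along image axes

# Made-up numbers: 10 px shift, 2 m altitude, 1000 px focal, 0.5 s apart.
v = velocity_from_frames(np.zeros((3, 2)),
                         np.tile([10.0, 0.0], (3, 1)),
                         altitude_m=2.0, focal_px=1000.0, dt=0.5)
```

The altitude could come from the laser range or depth sensor mentioned elsewhere in the thread, which is what ties this back to a combined adjustment.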

My project will basically comprise a series of photogrammetry tools that
can be used for SfM but also for alternate uses.


Thanks @alan.buchanan for the background, it gives me a bit of an idea of where to aim the response.

First off (and @phinal, if you are interested), flick me an email and I can give you some of the raw image data that I have shared with a few others such as @Jim_N. The data is just dumb (no location, no camera pose, just GoPro image data, but some of it is of pretty well-known-sized objects) and it has been used to generate a few SfM models.

E.g. the Annie M Miller, Undola, and Tuggerah, among others.

My email is in the recaptcha

My understanding of what is out there and what has been implemented (really just hearsay from around the traps):

Agisoft can use an XYZ set of data but does not implement any camera pose data; for XYZ it uses the data as a starting point for camera locations to reduce computing grunt (helpful for large data sets).

VisualSFM: I believe the Sydney Uni group (and maybe others) have a customised VisualSFM for stereo (one B&W camera for contrast and a second for colour, mounted a set distance apart for stereo) that has been customised to accept camera pose information.

Realistically, from a U/W perspective, pose and Z location (depth) are easy. Even something like the existing real-time laser distance measurement plugin, used for height above the bottom, is somewhat easily doable.

Also consider the localised point cloud that can be generated from something like the consumer-level multibeam device that @Stretch has implemented; if used in conjunction with visual data, this has more than enough range.

@phinal Re: Visual Odometry, maybe a chat with @Jim_N would also be worthwhile for some of the VO work he has been associated with. As well as lasers and structured-light patterns, consideration can also be given to structured-light pattern sequence projection.