The Development of a New Methodology for Automated Sounding Selection on Nautical Charts / Razvoj nove metodologije za automatizirani odabir

Manual selection of soundings for display on official nautical charts is time-consuming and is becoming more challenging because of the high-quality, high-density hydrographic data now available. Given the development of surveying technology, research into automated sounding selection is a logical step in improving the production of nautical charts. In this work a new methodology for automated sounding selection based on areas of sudden change in the seafloor relief is defined. Quantitative parameters of the seafloor obtained from the survey, slope and aspect, are used to segregate and classify seafloor features significant for navigation. Once their boundaries are detected, the sounding selection principles for each class are applied so that all relevant information about a specific feature is represented. Spatial accuracy analysis is conducted on two large multibeam hydrographic surveys by comparing the obtained results with the automated sounding selection feature within dKart Editor and with the manually selected soundings on official nautical charts. The RMSE (Root Mean Square Error) of vertical deviations and its relation to terrain characteristics in the initial quality assessment are encouraging and suggest that the proposed automated methodology is an improvement over dKart and could be applied with the same effectiveness as the manual method.


INTRODUCTION / Uvod
A nautical chart is specifically designed to meet the requirements of marine navigation, showing depths of water, nature of bottom, elevations, configuration and characteristics of coast, dangers and aids to navigation [1]. Measured or charted depths of water are called soundings [1]. Because the chart is used for marine navigation, the cartographer needs to balance a high-quality representation of the seafloor through soundings against the required level of clarity and readability. This demanding task is made even more daunting by the rapid development of hydrographic surveying. The survey results, in the form of a fair chart (smooth sheet), are the basis for sounding selection. Although some general guidelines for sounding selection are given in the International Hydrographic Organization (IHO) publication S-4 [2], the lack of a more detailed set of principles makes the selection a subjective process prone to inconsistency. For example, charts of the same scale (same navigational purpose) can have very different sounding densities depending on the preferences of the cartographer. Different survey techniques can also affect sounding density. These are all reasons for researching the automation of the selection.
Many studies have addressed the automated identification of topographic features on land and on the seafloor. Probably the most widely examined and cited geomorphological feature extraction process from DEMs (Digital Elevation Models) concerns the derivation of hydrological and fluvial features from surface models [3]. Wood [3] classified many different automated methods of feature extraction from surface models based on five dichotomous classification criteria. Kweon and Kanade [4] presented an algorithm for extracting topographic features from an elevation map. They constructed a contour map from the elevation map (DEM) and then created a contour tree in which the relationships among contour lines are represented. Features are extracted by finding and analyzing certain patterns (groups of closed or V-shaped contours) in the contour tree. Other studies used a parameterization of the continuously varying topographic surface into a discrete topological data structure or framework as a basis for feature extraction [5,6,7,8]. A prominent example of such surface topology data structures is Pfaltz's graph, or simply the surface network. A surface network is a graph-theoretic topological data structure that has proved useful for both characterizing and generalizing the form and topology of topographic surfaces [9]. Stepinski and Jasiewicz [10,11] developed a model for the classification and mapping of landforms using geomorphons (geomorphologic phonotypes). A geomorphon is a relief-invariant, orientation-invariant and size-flexible abstracted elementary unit of terrain. They used two parameters, search radius and relief threshold, and defined a set of 498 ternary patterns (geomorphons). In the method based on sounding attribution and depth areas [12], the authors use buffers and attribute adjustment to improve the selection of soundings in obstructions and around depth contours.
There is also an ontology-driven multi-agent system [13,14,15,16], in which a feature-centered ontology is created and a multi-agent system then uses measure algorithms to decide which generalization operators to apply. In the influence circle method [17,18,19] the user defines the radius of circles that cover the area, and the shallowest sounding inside each circle is selected. This method, in different variations, is one of the most used today. Most commercial software designed for the production, editing and maintenance of Electronic Navigational Charts (ENC), such as dKart Editor [20], ENC Designer [21] and S-57 Composer [22], provides a feature for automated sounding selection based on the influence circle method. One of the tools in dKart Editor is the automatic Sounding Selection Wizard (SSW). To execute sounding selection, the program needs the following input parameters: the sounding selection algorithm and the radius D of the data generalization area, which differs between depth ranges. The algorithms available for sounding selection are the shoal bias and the deep bias algorithm. In its essential features the shoal bias algorithm works as follows: it selects the shallowest object among those of the highest priority, marks other objects of the same priority in the D-vicinity of the selected one as 'to be deleted', goes to the second-shallowest object and performs the same routine, then the third-shallowest, and so on [23]. Having processed all objects with the highest priority, the program repeats the same actions with objects of the next lower priority. The deep bias algorithm works analogously, the only difference being that it begins with the deepest object rather than the shallowest. The shoal bias algorithm is considered the main one and is therefore set as the default, while the deep bias algorithm is used as an auxiliary to reconstruct the sea bottom profile in deep water regions in more detail.
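The shoal bias and deep bias routines described above can be illustrated with a minimal Python sketch. It is not the dKart implementation: object priorities are ignored (all soundings are assumed to share one priority), a single radius is used for every depth range, and the function name and the `(x, y, depth)` tuple layout are hypothetical.

```python
import math

def bias_select(soundings, radius, deep=False):
    """Return the soundings kept by a shoal bias (default) or deep bias pass.

    soundings: list of (x, y, depth) tuples, depth positive downwards.
    radius:    generalization radius D, in the same unit as x and y.
    Assumption: every sounding has the same priority.
    """
    # Shoal bias visits the shallowest soundings first, deep bias the deepest.
    ordered = sorted(soundings, key=lambda s: s[2], reverse=deep)
    selected = []
    for s in ordered:
        # A sounding within D of an already selected one counts as
        # 'to be deleted'; otherwise it is selected.
        if all(math.hypot(s[0] - t[0], s[1] - t[1]) > radius for t in selected):
            selected.append(s)
    return selected
```

Visiting the soundings in depth order and keeping only those farther than D from every previously kept sounding gives the same result as repeatedly selecting the next shallowest (or deepest) sounding that has not been marked for deletion.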
Although the quality of the results of these features varies with the density input parameters used, it is safe to assume that their use is currently limited to the selection of typical soundings over relatively flat seafloor areas [24].
In this study a new methodology is presented for the automated selection of significant and critical soundings that show unexpected changes in the seafloor. To achieve this, an algorithm is created in which significant seafloor features are detected using slope and aspect calculated from the survey and are categorized by the type of feature they represent. A set of selection rules for each category is defined within the algorithm, based on parameters extrapolated from the existing soundings on official charts. The method is tested in two different areas for which a multibeam echo sounder survey was conducted by the Hydrographic Institute of the Republic of Croatia (HHI). Accuracy is assessed by comparing the results of the proposed method with the manually selected soundings charted on the official HHI charts. The influence circle method (dKart Editor) is also compared with the same official HHI charts. At HHI, dKart Editor is used for ENC creation and maintenance.

METHODOLOGY / Metodologija
An analysis of manually selected soundings on any nautical chart shows that their layout and density differ according to the variability of the seafloor. The best quantitative descriptors of this variability that can be effectively obtained from the survey data are the slope and aspect of the seafloor terrain. Slope and aspect are very useful parameters, found in many studies related to underwater relief, habitat suitability, soil erosion, depths and many other fields [25,26,27,28,29]. Since the density of manually selected soundings is related to slope and aspect, characterizing the seafloor by these terrain parameters makes it possible to delineate the regions of interest for the sounding selection process, namely the areas of significant change in the seafloor.

Data processing / Obrada podataka
The entire sounding selection process, whether manual or automated, depends on the quality of the hydrographic survey input data. The Cartographic Department of the HHI receives high-density multibeam survey data (2 m), which are considered suitable for extracting and creating input data for the automated sounding selection within this research. The seafloor model from which the needed parameters are calculated can be in vector format (Triangulated Irregular Network, TIN) or raster format (DEM). The choice between vector and raster format depends on the type of GIS analysis to be performed. The TIN interpolation method works best for creating digital topography from irregularly spaced known elevation points, such as points extracted from contour lines. If the elevation points are spaced in a regular grid (like our input hydrographic data), the elevation values can be converted directly into a raster DEM whose cells are the same size as the distance between input survey points (2 m). A continuous surface is created with the nearest neighbor interpolation method. The raster resolution is determined by the multibeam data [30], resulting in a high-resolution DEM (2 m grid resolution) on which the terrain analysis can be performed. The influence of data precision on derived slope and aspect is closely related to the grid resolution. With a high-resolution DEM (e.g. 2 m grid resolution), the influence of data precision becomes quite significant [31]. The slope and aspect of a cell in the elevation grid can be measured by the amount and direction of tilt of the cell's normal vector [32]. Horn's algorithm [33], as implemented in QGIS, is applied to the DEM to compute slope and aspect on the grid.
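Slope S at a given cell is defined using its eight neighboring cells e_1, e_2, …, e_8 (numbered row by row from the upper-left corner of the 3 × 3 window, skipping the center cell) as:

$$S = \frac{\sqrt{\left[(e_1 + 2e_4 + e_6) - (e_3 + 2e_5 + e_8)\right]^2 + \left[(e_6 + 2e_7 + e_8) - (e_1 + 2e_2 + e_3)\right]^2}}{8d}$$

where d is the cell size, a weight of 2 is applied to e_2, e_4, e_5 and e_7, and a weight of 1 to e_1, e_3, e_6 and e_8 [32]. Aspect D is defined as:

$$D = \arctan\left(\frac{(e_6 + 2e_7 + e_8) - (e_1 + 2e_2 + e_3)}{(e_1 + 2e_4 + e_6) - (e_3 + 2e_5 + e_8)}\right)$$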
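As an illustration of Horn's weighting described above, the following Python sketch computes slope and the direction angle D for one 3 × 3 window. It is not the QGIS implementation. The window layout, the function name and the use of `atan2` (chosen here to avoid division by zero; it returns the angle in the correct quadrant rather than the plain arctangent) are assumptions, and the conversion of D to a compass azimuth is left out.

```python
import math

def horn_slope_aspect(window, d):
    """Slope (percent) and direction angle D (radians) for the centre cell.

    window: 3x3 list of elevations, assumed to be laid out as
                e1 e2 e3
                e4 e0 e5
                e6 e7 e8
    d:      cell size, in the same unit as the elevations.
    """
    (e1, e2, e3), (e4, _, e5), (e6, e7, e8) = window
    gx = (e1 + 2 * e4 + e6) - (e3 + 2 * e5 + e8)  # left column minus right column
    gy = (e6 + 2 * e7 + e8) - (e1 + 2 * e2 + e3)  # bottom row minus top row
    if gx == 0 and gy == 0:
        return 0.0, None                          # flat cell, direction undefined
    slope = math.hypot(gx, gy) / (8 * d)          # rise over run
    return 100.0 * slope, math.atan2(gy, gx)
```

For a 2 m grid on which the depth changes by 0.2 m per cell in one direction, the function returns a slope of about 10%, which is the limit value used later to separate steep slope.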

Segregation of significant seafloor features for navigation / Segregacija značajki morskog dna važnih za navigaciju
After the data were processed, a program was created to detect important areas using various parameters, to classify them and to select soundings for each class of features. These parameters have default values that are used in this research, but users can change and test different values according to their needs. First, the input data are imported into the program and stored as a list of tuples. To ensure safe navigation it is necessary to detect features on the seafloor which may be a hazard to navigation, whether natural or man-made. A feature is defined as any item on the seafloor which is distinctly different from the surrounding area [34]. Three criteria are defined within the algorithm with which the user can determine which seafloor features are significant enough to be highlighted on the chart with a denser selection of soundings. The first criterion is the limit value of the slope. For the purpose of this research, and as the default value within the program, slopes ≥ 10% are used to separate points representing steep slope. To group these points, the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm is used. The user needs to define the maximum distance between points at which they are considered part of the same group (second criterion). The default distance is based on the density of the input data (2 m) and is set at 6 m. This separates elements of slope that potentially describe a significant feature, depending on the last criterion within this stage of the selection process. The third criterion is based on the numerical definition of a significant seafloor feature given by the IHO in the Zones of Confidence (ZOC) table of its publication S-57.
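The first two criteria can be sketched in Python as follows. This is only an illustration under stated assumptions, not the program used in the study: the tuple layout `(x, y, depth, slope_percent)`, the function names and the minimum number of neighbors `min_pts` are hypothetical (the text specifies only the 10% slope limit and the 6 m distance), and a small textbook DBSCAN is written out so that the block needs no external library.

```python
import math

def dbscan(points, eps, min_pts):
    """Label each (x, y) point with a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)

    def neighbours(i):
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if math.hypot(xi - xj, yi - yj) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1               # provisional noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster      # former noise becomes a border point
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:       # only core points expand the cluster
                queue.extend(nb)
    return labels

def steep_slope_elements(cells, slope_limit=10.0, eps=6.0, min_pts=3):
    """Group steep cells into elements of slope.

    cells: list of (x, y, depth, slope_percent) tuples, x and y in metres.
    min_pts is an assumed value; it is not given in the text.
    """
    steep = [c for c in cells if c[3] >= slope_limit]              # criterion 1
    labels = dbscan([(c[0], c[1]) for c in steep], eps, min_pts)   # criterion 2
    elements = {}
    for cell, label in zip(steep, labels):
        if label != -1:                  # isolated steep cells are dropped
            elements.setdefault(label, []).append(cell)
    return list(elements.values())
```

With a 2 m grid, the 6 m default distance means that steep cells up to three grid steps apart still fall into the same element.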
ZOC is a method of encoding data quality information by classifying all bathymetric data and identifying various levels of confidence that can be placed in the underlying data using a combination of depth and position accuracy, seafloor coverage and typical survey characteristics [35]. The ZOC table also contains additional parameters, but since this research uses only its numerical definition of a significant feature, they are not listed. The criteria define significant seafloor features as those that differ from the surrounding area. For features under 10 m the criterion is: For features within areas of 10-30 m, the difference criterion is 4 m. Finally, for features above 30 m, the criterion is: As with all criteria, the user can change these default values. The algorithm calculates the height of a feature from the differences between the mean value of all bordering soundings (those nearest to the edge of the element of slope that are not part of the element itself) and the minimum and maximum values of the soundings inside the element. The result with the larger absolute value is the height of the feature, and its absolute value is subjected to the third criterion. The results at this stage are segregated seafloor features (elements of slope) considered significant for navigation.
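The height calculation can be sketched as below. The function name is hypothetical, and the sign convention is an assumption: depths are taken as positive downwards and the inside values are subtracted from the border mean, so that a shallower interior gives a positive height. The depth-dependent thresholds are not reproduced here; the absolute value of the returned height is what would be compared against them.

```python
def feature_height(border_depths, inside_depths):
    """Signed height of an element of slope relative to its surroundings.

    border_depths: soundings nearest to the element edge, outside the element.
    inside_depths: soundings inside the element.
    Assumed convention: depths positive downwards; a positive result means
    the element rises above its surroundings, a negative one that it is deeper.
    """
    mean_border = sum(border_depths) / len(border_depths)
    to_shallowest = mean_border - min(inside_depths)
    to_deepest = mean_border - max(inside_depths)
    # Keep the difference with the larger absolute value.
    if abs(to_shallowest) >= abs(to_deepest):
        return to_shallowest
    return to_deepest
```

For example, an element bordered by 20 m soundings that contains an 8 m sounding has a height of 12 m, while an element bordered by 50 m soundings that reaches 56 m has a height of -6 m.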

Classification of segregated elements of slope / Klasifikacija segregiranih elemenata kosine
Every segregated element of slope can represent one of two types of features: elevations or depressions. The proposed methodology distinguishes them by whether the aforementioned height of the feature (whose absolute value is used in the third criterion for detecting significant seafloor features) is negative (depression) or positive (elevation). The algorithm then filters the segregated elements of slope to distinguish circular or near-circular elements from elongated ones. This is necessary because the rules of sounding selection for elongated seafloor features such as channels differ from those for circular ones such as seamounts. One of the methods for this [36] is based on a modified perimeter P to area A ratio (P/A)' = O' defined as: (2.5) This results in circular features having an O' ratio of 1 irrespective of size, and the more elongated a feature is, the greater the value. Features are filtered so that those with O' < 2 are considered near circular, and those with O' ≥ 2 elongated [36].
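The exact form of the modified ratio is not legible in this text, so the sketch below rests on an assumption: it uses the common normalization O' = P / (2·sqrt(π·A)), which equals 1 for a circle of any size and grows with elongation, as the description requires. The formula in [36] may differ (for example, it could be the square of this quantity), and the function names are hypothetical.

```python
import math

def elongation_ratio(perimeter, area):
    """Assumed form of O': perimeter divided by that of an equal-area circle."""
    return perimeter / (2.0 * math.sqrt(math.pi * area))

def is_elongated(perimeter, area, limit=2.0):
    """Apply the O' >= 2 limit from the text to the assumed ratio."""
    return elongation_ratio(perimeter, area) >= limit
```

Under this assumption a square has a ratio of about 1.13 and a 20 × 1 rectangle about 2.65, so only the latter would be treated as elongated.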
The near-circular elements of slope whose whole area consists entirely of slope ≥ 10% are separated from partly filled elements that contain parts with slope < 10%. Figure 1 (left) shows an example of a partly filled slope element, with green indicating steep slope and a small flat area within it (blue). Elongated elements of slope are divided by the number and position of elements needed to represent a certain seafloor feature: one isolated element, two joint elements facing each other, and two separated elements facing each other. Figure 1 (right) shows one isolated elongated slope element. The algorithm considers two elements to be facing each other if the difference of their aspects equals 180° ± 30°. If two elements are separated by more than 5 cm, they are considered two isolated elements describing two isolated features.
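The facing test involves angles that wrap around at 360°, which a short sketch makes explicit. The function name is hypothetical, and how a single representative aspect is obtained for a whole element (for instance a mean of its cells) is not stated in the text, so the function simply takes one aspect per element in degrees.

```python
def aspects_face_each_other(aspect_a, aspect_b, tolerance=30.0):
    """True if two aspects (degrees, 0-360) differ by 180 degrees +/- tolerance."""
    diff = abs(aspect_a - aspect_b) % 360.0
    diff = min(diff, 360.0 - diff)        # smallest angle between the two, 0..180
    return abs(diff - 180.0) <= tolerance
```

An element facing 350° and one facing 160° differ by 170° and are treated as facing each other, whereas aspects of 10° and 100° are not.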
Rules of sounding selection within the algorithm are based on the number of soundings and the order of selection. The number of soundings depends on whether a feature is circular or elongated, while the order of selection is specific to each type of feature. All distance and area values within the selection principles are measured on the chart. For circular features the number of soundings depends on the size of the feature area: areas < 3.5 cm² are shown with a single sounding, and one sounding is added for every 3.5 cm². For elongated features the number of soundings depends on their width and length. For widths < 1.5 cm a single sounding is used, with one added every 1.5 cm. For lengths < 2.5 cm a single sounding is used, with one added every 2.5 cm. These values are based on the median distances obtained by analyzing representations of distinct features and on the median size of the soundings themselves on HHI charts. The above parameter values (and all others within the program) are the default values used in this research; the user is asked to confirm them or can change them as needed.
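One reading of these counting rules is sketched below: one sounding, plus one more for every full step of area, width or length. This interpretation, the function names and the helper that converts chart centimetres to ground metres are assumptions made for illustration.

```python
def circular_count(area_cm2, step=3.5):
    """Soundings for a circular feature: one, plus one per full 3.5 cm2."""
    return 1 + int(area_cm2 // step)

def elongated_counts(width_cm, length_cm, width_step=1.5, length_step=2.5):
    """Soundings across the width and along the length of an elongated feature."""
    across = 1 + int(width_cm // width_step)
    along = 1 + int(length_cm // length_step)
    return across, along

def chart_cm_to_ground_m(chart_cm, scale_denominator=25000):
    """Convert a distance measured on the chart (cm) to metres on the ground."""
    return chart_cm * scale_denominator / 100.0
```

On a 1:25 000 chart the 2.5 cm length step corresponds to 625 m on the ground, and a feature 1 cm wide and 6 cm long would receive one sounding across its width and three along its length.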

Selection order for each feature class / Redoslijed odabira za razred svake značajke
Entirely filled circular elements of slope can describe two seafloor features: a seamount or a hole. A seamount is a distinct spire-shaped elevation cresting a summit. Depending on their distance to the surface, seamounts can represent a serious hazard to navigation. The order of selection is the following: the shallowest sounding is always selected first (usually located inside the element; sounding 8 m in the red circle in Figure 2), and then, depending on the size of the seamount, the deepest soundings (usually located on the edge of the element; sounding 29 m in the red circle in Figure 2) are added to show the size of the feature. This selection order ensures the coexistence of the selected soundings and the depth contours (isobaths), which are located between the soundings and contribute to a quality display of the feature. Holes have the same shape as seamounts, but they represent a depression instead of an elevation. Because of that, unlike seamounts, they are not considered a hazard to navigation, which eliminates the need for more than one sounding and leaves the depth contours and the deepest sounding inside as an adequate representation. Figure 2 shows a hole (blue circle) represented with a depth contour (50 m) and a sounding (56 m), manually selected on the official navigational chart [37].
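The selection order for seamounts and holes can be illustrated with a short sketch. The function names and the `(x, y, depth)` tuple layout are hypothetical, and the sketch is deliberately simplified: it picks the deepest soundings by value only and does not check that they lie on the edge of the element or enforce any spacing between them.

```python
def select_seamount(soundings, count):
    """Shallowest sounding first, then the deepest ones up to `count` in total.

    soundings: list of (x, y, depth) tuples belonging to the element.
    count:     number of soundings allowed by the size of the feature.
    """
    ordered = sorted(soundings, key=lambda s: s[2])
    picked = [ordered[0]]                  # the shallowest is always first
    deepest_first = ordered[:0:-1]         # remaining soundings, deepest first
    picked.extend(deepest_first[:count - 1])
    return picked

def select_hole(soundings):
    """A hole is represented by its single deepest sounding."""
    return [max(soundings, key=lambda s: s[2])]
```

For a seamount with soundings of 8, 15, 27, 28 and 29 m and room for three soundings, the sketch returns 8 m first, followed by 29 m and 28 m.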
Partly filled circular elements of slope can describe three seafloor features: a guyot, a basin and an island slope. A guyot (tablemount; Figure 1, left) is a seamount with a roughly smooth, flat top. The order of selection is: the shallowest sounding is always selected first (located inside the smooth blue top; sounding 19.7 m in Figure 1, left), then, depending on the size, soundings on the outer edge of the feature are added (soundings 30 m and 36 m in Figure 1, right) to increase the information about this dangerous feature. The part of the element without slope could also be filled with land, which means that the element of slope encircles an island or a rock. In this case only the deepest soundings on the outer edge are selected to show the beginning of the slope. A basin is the same as a guyot but represents a depression. As with holes, its navigational significance is marginal, so a single deepest sounding complements the depth contours in its visualization.
One isolated elongated element of slope can describe two seafloor features: a coastal slope and an escarpment. The coastal slope is an elongated version of an island slope and the selection is the same: the deepest soundings on the outer edge are selected to show the beginning of the slope. An escarpment (scarp) is a steep slope separating horizontal or gently sloping areas of the seafloor. First the shallowest soundings are selected (Figure 1, right, soundings 28 m and 29 m at the edge of the escarpment); then, based on the width, soundings positioned roughly midway between the selected shallowest soundings are selected on the deeper edge of the escarpment (Figure 1, right, soundings 38 m and 39 m). Other pairs in this sequence can be selected based on the length of the feature. This order of selection is designed to convey information about the feature in order of importance.
Two joint elongated elements of slope can describe three narrow or V-shaped seafloor features: a coastal channel (Figure 3), a seafloor channel and a ridge.

A narrow channel resembles an elongated hole, but despite being a depression, and unlike the hole, it has significant navigational meaning because it is often used as a safe passage. It can be surrounded by land (coastal) or by relatively flat seafloor. The order of selection for a narrow seafloor channel is the following: the deepest sounding is selected first (usually located in the middle, where the elements are joined), then, based on the width, shallower soundings on the edges in line with the first sounding are added. Other sets are added based on the length of the feature. In Figure 3, upper left, the whole channel area is covered in steep slope. However, as the profile shows (Figure 3, lower), these are two joint sloping elements facing each other that represent a single seafloor feature. For a narrow coastal channel only the deepest soundings are selected, at the defined distance for a certain feature length (values 25 m, 30 m and 22 m in Figure 3, upper right). A ridge is an elongated elevation of varying complexity. It can pose a hazard to navigation, but because it is larger than a seamount it is usually easier to notice on a chart. The order of selection is basically the same as for the seafloor channel, the difference being that the shallowest soundings come first (in the middle), followed by the deepest soundings on the outer edges of the feature.
Two separated elongated elements of slope can describe three wide or U-shaped seafloor features: a coastal channel, a seafloor channel and a ridge. A wide seafloor channel is an elongated depression between two slopes (escarpments) facing each other, at the bottom of which lies a horizontal or gently sloping area of the seafloor. The order of selection for a wide seafloor channel is the following: four soundings are selected in a line roughly perpendicular to the elements, two shallower ones on the outer edges and two deeper ones on the inner edges of the sloping elements; depending on the width of the area between the elements, a sounding in the middle of the area can be added. For a wide coastal channel, a triangular layout of three soundings is selected as a minimum, two on the edges and one in the middle of the horizontal area, with more added in between if the width permits. The last feature is a ridge with a horizontal or gently sloping area on top. The order of selection is the following: the shallowest sounding is selected first; then, depending on its location (in the middle or on the edge of the horizontal area) and on the width, two additional soundings are added in line; finally, the deepest soundings on the outer edges of the elements are added.
In Figure 4 we can see an example of the automated selection for a wide coastal channel. In the northern part, sets of three soundings are selected, two on the edges of slopes indicating the outer border of the navigable part of the channel and the deepest sounding between the slopes. In the southern part, because the channel is wider, one more sounding is added in the set.

QUALITY ASSESSMENT / Procjena kvalitete
In this study, the manual selections on official charts are taken as the desired level of selection quality. These official selections are compared with the results of the defined algorithm and of the dKart SSW, whose features were explained above. The focus is on the quality of sounding selection for distinct navigational features, so two areas are chosen for testing. Both areas contain seafloor features (channels, seamounts) that make them suitable for this analysis. The first area is part of a channel connecting Šibenik harbour and Lake Prokljan, with a hydrographic survey of 920 420 soundings (Figure 5, upper). The second area is the northern part of the Šibenik channel, between the island of Lupac and the mainland, with a hydrographic survey of 1 679 746 soundings (Figure 5, lower). Both areas are represented on a 1:25 000 scale approach chart. To achieve the best results, the average distances between soundings on official charts (manual selection) for different depth areas are used as dKart SSW input parameters.
For the algorithm to carry out the automated sounding selection, slope maps of the analyzed areas are generated, along with aspect data. The algorithm then detects the defined significant seafloor features and selects soundings from the survey according to the rules for each feature class. To test the quality of the selected soundings, surfaces are created by Kriging interpolation for both areas and all three methods (manual, slope and dKart). Ordinary Kriging with a linear semivariance model is used. These surfaces are then compared with the 'true' representation of the seafloor, the hydrographic survey, to determine which selection visualizes the seafloor better and which automated method is more similar to the manual one. In the future, the research should be expanded to include other automated selection methods. Vertical deviations are calculated for every sounding of the survey. A vertical deviation is defined as the difference in value (z) between the 'true' sounding (survey) and the approximated sounding at the same position (x, y) on the interpolated surface obtained from the manual or automated selection. The RMSE of vertical deviations for each method is summarized and compared. Tables 1 and 2 show descriptive statistics and the analysis results for areas 1 and 2 respectively.
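The RMSE of vertical deviations can be computed as in the sketch below. It assumes that the interpolated surface has already been sampled at the survey positions, so that both lists hold depths for the same (x, y) points; the Kriging step itself is not shown and the function name is hypothetical.

```python
import math

def rmse_vertical(survey_depths, surface_depths):
    """Root mean square error of vertical deviations between two depth lists.

    survey_depths:  'true' survey soundings (z), one per survey position.
    surface_depths: depths of the interpolated surface at the same positions.
    """
    deviations = [z - z_hat for z, z_hat in zip(survey_depths, surface_depths)]
    return math.sqrt(sum(dev * dev for dev in deviations) / len(deviations))
```

Running the same function on the surfaces from the manual, slope-based and dKart selections gives one RMSE value per method and area, which is the quantity compared in the results.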
The RMSE of vertical deviations of the manual data was found to be 4.77 m (area 1) and 4.48 m (area 2). The slope-based automated data generated similar results, 4.03 m (area 1) and 4.55 m (area 2), while the dKart data differ significantly in area 1 (6.92 m) and area 2 (8.11 m). The manual selection has more balanced results, while both automated data sets fluctuate more. The difference between the two areas for both automated data sets can be explained by the higher level of seafloor complexity in the second area. The accuracy of all methods was affected where the terrain is characterized by slope values greater than or equal to 10%. In the first study area, the average magnitude of errors is around three times higher for automated data and four times for manual data on terrain with slope values exceeding 10% than in areas where slope values are less than 10% (2.12 m vs. 6.31 m for slope-based data, 3.67 m vs. 11.81 m for dKart and 2.53 m vs. 8.11 m for manual data). Although the average errors are higher and the second study area is more complex, the same analysis showed smaller differences for all methods with respect to the 10% slope limit (3.15 m vs. 7.16 m for slope-based, 4.93 m vs. 13.08 m for dKart and 3.14 m vs. 6.93 m for manual data).
The linear regression analysis revealed a strong correlation between the automated slope-based data and the manual data for area 1 (Figure 6, upper left) and for area 2 (Figure 6, upper right). The slope of the regression line for slope-based data (0.9427 and 0.9616) was similar in both areas and closer to 1 than for dKart data (0.8837 and 0.9081). This reveals a slightly better agreement between slope-based data and manual data than between dKart data and manual data. The coefficient of determination r² is higher for slope-based data than for dKart data, but in both cases it indicates a very strong relationship with the manual data.
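The regression slope and r² reported above are standard least-squares quantities, sketched below. Which values were paired in the study is an assumption here: `x` is taken to hold depths from the manual-selection surface and `y` depths from an automated-selection surface at the same positions. The function name is hypothetical.

```python
def linear_fit(x, y):
    """Least-squares slope, intercept and coefficient of determination r^2."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared
```

A slope close to 1 with a high r² indicates that the automated surface reproduces the manual one without systematic scaling, which is how the values 0.9427 and 0.9616 are interpreted in the text.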
It can also be observed that points for slope-based data deviate uniformly from the regression line, while for the dKart data (because it uses a shoal-biased algorithm) most points visibly fall below the regression line (are shallower) in both areas (Figure 6, lower left and right). It should be noted that the discrepancies of both the slope-based and the dKart data are due not only to imperfections of the selection algorithms but also to flaws in the manual selection. However, manual data were used for comparison because the main goal of this research is to automate the selection while keeping the same level of quality.
The characterization of undersea features enabled a method for their recognition in the study areas using bathymetry and its derived parameters. By using slope and aspect values, soundings can be used effectively to convey important navigational information about notable seafloor features to the mariner. The results of the analysis performed on two study areas show that the automated selection based on slope and aspect can match the manual selection in detecting significant features and is a step forward compared with the shoal-biased automated feature in dKart Editor. The similarity between the proposed automated selection and the manual selection comes from the layout of the results. The proposed method is based on the principles behind the manual selection and is therefore intended to mimic it, with enhanced precision derived from slope data that are not accessible during the manual process. Because the manual selection is made on hydrographic data without graphical information on slope values, some unnecessary soundings selected in the middle of a slope instead of on its edge have decreased the quality of its representation of the seafloor. The complexity of the seafloor has an expected impact on the precision of its representation for all methods. It must be stated, however, that although the number of significant features is higher in the second study area, both areas contained clearly separated features, which makes their detection, and thus their visualization via soundings, easier. In the future, the algorithm needs to be tested on more complex areas with multiple intertwined elements of slope. A dedicated test of the impact of different variations of all the defined parameters within the algorithm could also be conducted to improve its quality of selection.