Mapping Land Suitability for Sugar Cane Production Using K-means Algorithm with Leaflets Library to Support Food Sovereignty in Central Java

Indonesia is the largest sugar importer in the world, which runs contrary to the government's aim of achieving sugar self-sufficiency. To reduce the dependence on sugar imports and support national food sovereignty, geographic information system (GIS) technology can be used to present information for the government to consider when setting policies on the management of sugarcane land resources. The K-means algorithm is used to group regions according to production level, while the Matching method is used to evaluate the suitability of land for sugarcane. The data are presented as a map on the web using a new model for processing land data: the model converts the production grouping data and the land suitability class data into GeoJSON, which is then mapped with Leaflet. This new model enables dynamic land data processing and visualization in the form of interactive maps. The EUCS test of the GIS for mapping land suitability and sugarcane production gave a score of 3.23 (Satisfied) out of a maximum of 4, so the system is acceptable to users.


Introduction
Sugarcane (Saccharum officinarum L.) is a plantation commodity used as the raw material for making sugar. The Decree of the Coordinating Minister for the Economy No. Kep-28/M.EKON/05/2010 concerning the Staple Food Stabilization Coordination Team lists sugar among the staple foods, together with rice, cooking oil, flour, soybeans, beef, chicken meat, and chicken eggs [1]. As a source of calories besides rice, corn, and tubers, sugar is a basic need that is consumed in large quantities, from the household to the industrial scale [2]. Because this high level of sugar consumption is not matched by sugar production, Indonesia has to import sugar to meet demand.
According to data released by Statista, in the 2017/2018 period Indonesia was the largest sugar importer in the world, with imports reaching 4.45 million tons. This figure exceeded those of China and the United States (US), at 4.2 million tons and 3.11 million tons, respectively [3]. If this is not addressed soon, Indonesia will suffer heavy losses from its dependence on imports. Based on Table 1, Central Java is the third largest sugar production center after East Java and Lampung, so Central Java makes a significant contribution to the sugar industry in Indonesia. However, sugar production in Central Java is still below the government's target for sugar self-sufficiency. Therefore, in 2018 the Governor of Central Java established Central Java Provincial Regulation Number 1 on Increasing Sugarcane Productivity, which includes a local government program to empower farmers, one element of which is to provide or expand sugarcane land [5].
According to Surono (2006), in a journal article entitled Sugar Self-Sufficiency Policy in Indonesia, one of the reasons sugar self-sufficiency matters for Indonesia is the need to maintain food sovereignty: sugar is one of the main foodstuffs and is in high demand, so it must always be available in sufficient quantities and at reasonable prices. To achieve the sugar self-sufficiency target, a method that can map yields is needed to identify areas where sugarcane production is below its potential, so that they can be handled effectively [6].
One technology that can be used to map spatial data is the Geographic Information System (GIS). GIS is needed in agriculture for planning crop production management [7]. Government agencies such as the Ministry of Agriculture currently have a GIS for sugarcane monitoring. However, that GIS focuses only on the composition of the area by sugarcane growth phase, so its analysis of sugarcane production in each region is limited. With the K-means algorithm, regions can be grouped by their level of sugarcane production. Maximizing sugarcane production also requires agricultural planning that matches the capabilities of the land. Land suitability for sugarcane can be assessed through a land suitability evaluation that classifies potential land into classes S1 (very suitable), S2 (suitable), and S3 (marginally suitable) [8].
Based on these problems, we conducted a study using the Leaflet library and OpenStreetMap to cluster regions by their level of sugarcane production using the K-means algorithm and to map land suitability for sugarcane in Central Java Province. The result is a web-based GIS that is expected to present information the government can use as a consideration when setting policies on the management of sugarcane land resources, in order to realize food sovereignty for the sugar commodity.
A related earlier study, titled Clustering using K-means and Fuzzy C-Means on Food Productivity, clustered the productivity of one food commodity, namely rice, using the K-means and Fuzzy C-means algorithms, with Excel used to process the data. Three clusters were obtained, with cluster 1 and cluster 2 having low productivity, so it was concluded that rice productivity in the majority of provinces in Indonesia is low [9].
Another study, Implementation of K-means Algorithm for Mapping of Harvest Productivity in Karawang District, applied clustering with the K-means algorithm to map rice harvest productivity data by dividing the data into three groups (below target, on target, and above target), using rice planting area and production as attributes. The mapping results were visualized as a map on the web [10].
The study entitled Spatial Model Design of Landslide Vulnerability Early Detection with Exponential Smoothing Method Using Google API produced a spatial model for the early detection of landslides based on rainfall data and soil condition data, using the Single Exponential Smoothing method implemented with the Google API. The model is able to predict areas prone to landslides [11].
The study titled Cluster Analysis Using Fuzzy C-means and K-means Algorithms for Clustering and Mapping of Agricultural Land in Southeast Minahasa conducted a cluster analysis to determine the agricultural land areas for the paddy, corn, and cassava commodities based on the attributes harvested area, planted area, and production. The result of the study is a grouping of agricultural areas by commodity and attribute. The authors note that further studies are needed that take into account the slope of the land, the soil structure, and the compatibility of the commodity with the land [12].
The study entitled Evaluation of Land Suitability for Rice Commodities by Utilizing the Application of Geographic Information Systems (GIS) in Central Lombok Regency aimed to determine the land suitability classes for lowland rice and upland rice in Central Lombok Regency based on topography, soil type, and climate. The method used was the Matching method, which compares land characteristics against the existing land suitability class criteria. The result of the study is a visualization in ArcView GIS of the suitability classes for lowland and upland rice and their extent [13].
The novelty and advantage of this research over previous research lie in optimizing how land data are processed and presented by means of a new model for processing land data. With this model, the output of the K-means algorithm, which groups regions by production level, and the output of the Matching method, which classifies areas by land suitability, can be converted into GeoJSON. Using the Leaflet library with OpenStreetMap as its basemap, the GeoJSON data are visualized as an interactive web-based map, which allows the data to be processed dynamically and efficiently. The information produced by combining the production map and the land suitability map is expected to serve as input to the management of sugarcane land resources.
The K-means algorithm groups data into several clusters based on similar characteristics, so that the members of one cluster differ in their characteristics from the members of the other clusters. The distance between two objects in K-means can be determined by equation (1).
$$d_{ij} = \sqrt{\sum_{k=1}^{P} \left(x_{ik} - x_{jk}\right)^{2}} \tag{1}$$
where:
$d_{ij}$ = distance between object $i$ and object $j$
$P$ = data dimension
$x_{ik}$ = coordinate of object $i$ on dimension $k$
$x_{jk}$ = coordinate of object $j$ on dimension $k$
Food sovereignty cannot be separated from the potential of land resources for plant growth. Surveys and inventories of land resources need to be emphasized to support agriculture. However, such land resource surveys are still limited. Land evaluation needs to be carried out on land resource data so that it can produce land suitability information [14].
One of the land suitability classification systems is that of FAO (1976). Land suitability must be determined on the basis of plant growth requirements. Land suitability evaluation is carried out using the Matching method, based on the requirements for growing sugarcane in Table 2.
The elements that are important for the growth of sugarcane are rainfall, sunlight, wind, temperature, and slope. Of these, the parameters used in this study are temperature, rainfall, and slope.

Method
The research process of mapping sugarcane production using the K-means algorithm and land suitability using the Matching method is carried out through several interrelated stages. The research methodology chart is shown in Figure 1.
-Stage 1: The first stage is to identify the problem, namely the high level of sugar imports, which is contrary to the government's desire for self-sufficiency in sugar. A literature study is then conducted to find references for solving the problem.
The data for regional grouping based on the level of sugarcane production use three variables, namely planted area (ha), harvested area (ha), and production (tons), obtained from the 2015-2017 Indonesian Plantation Statistics.
UML is used to create the Use Case Diagram, Activity Diagram, and Class Diagram for the system design. The Use Case Diagram of the application to be built can be seen in Figure 2. Figure 2 shows that the system has two types of user, namely admin and user. The admin has access rights to manage sugarcane production data, manage land suitability data, manage users, perform cluster calculations on sugarcane production data, perform land suitability calculations, and view map visualizations. Users can only view the maps that have been visualized on the web pages.
The Class Diagram illustrates the structure of the system, defined through classes according to the system requirements, as shown in Figure 3. Based on Figure 3, the City Class contains attributes of the cities or districts in Central Java Province, along with the polygon coordinates used to draw each area on the map. The Production Data Class contains the sugarcane production data. The Cluster Class contains the class grouping results of the K-means algorithm. The Centroid Result Class contains the minimum distance of the data to the centroids, while the Centroid Class contains the grouping of the data by centroid. The Conformity Data Class holds the land suitability data, where each city has exactly one land suitability record.
Figure 4 is the Activity Diagram for the case where the admin clusters the sugarcane production data. The admin must log in first. After successfully logging in, the admin can manage the data by selecting the Manage data menu. To cluster the sugarcane production data, the admin enters the sugarcane production data management menu and chooses the calculate class cluster menu. The admin then chooses the year of the data to be processed, after which the system performs the K-means calculation on the sugarcane production data for the chosen year. The grouping results are then stored in a database.

Result and Discussion
The resulting system is a Geographic Information System (GIS) that can process production data using the K-means algorithm, determine the land suitability class using the Matching method, and visualize both on the web. The process of presenting the data as a map on the web uses a new model, illustrated in Figure 5. Figure 5 is a land data management model whose inputs are sugarcane production data and land suitability data. Data management in the system is implemented in the PHP programming language. Cluster calculations use the K-means algorithm, while land characteristics are evaluated using the Matching method. The result of the data processing is numerical data, which are stored in a database and converted by the system into GeoJSON. The GeoJSON data are then displayed interactively as a map in the system using the Leaflet library, so that the information can be analyzed by experts in land management.

1) K-means Algorithm and Matching Method
The K-means program flow is modeled on the K-means algorithm flowchart illustrated in Figure 6. For the initialization, the initial centroids are determined using the Simple Random Sampling method, that is, by taking random sample data, as seen in Pseudocode 1.

Pseudocode 1 K-means Algorithm Initialization of Center Point
Pseudocode 1 is the initialization of the center points. Centroid 1 is taken from the data with the smallest production, centroid 2 from randomly selected data, and centroid 3 from the data with the largest production. The next step is to calculate the distance of each i-th data item to the center points.
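The initialization described above can be sketched as follows. Python is used purely for illustration (the system itself is written in PHP), and the function name `init_centroids`, the record layout, and the sample values are hypothetical. Excluding the two extremes from the random pick is also an assumption made here so that the three centroids are distinct; the source does not state how the random record is chosen.

```python
import random

def init_centroids(records, seed=None):
    """Pick three starting centroids from rows of
    (planted_area_ha, harvested_area_ha, production_tons).
    Requires at least three records."""
    rng = random.Random(seed)
    lowest = min(records, key=lambda r: r[2])    # smallest production -> centroid 1
    highest = max(records, key=lambda r: r[2])   # largest production  -> centroid 3
    # Assumption: the random pick skips the two extremes so all centroids differ.
    others = [r for r in records if r is not lowest and r is not highest]
    return [lowest, rng.choice(others), highest]

# Illustrative values only, not the 2015 plantation statistics.
sample = [
    (120.0, 110.0, 500.0),
    (900.0, 850.0, 4200.0),
    (3000.0, 2900.0, 15000.0),
    (450.0, 400.0, 2100.0),
]
centroids = init_centroids(sample, seed=0)
```

Passing a fixed `seed` makes the random choice repeatable, which is convenient when comparing a run against a manual calculation.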

Pseudocode 2 K-means Algorithm Calculating Euclidean Distance
The Euclidean distance is calculated in lines 2 to 4 to measure the distance of the data from each cluster center, giving the distances hc1, hc2, and hc3. The distances are then compared, and in lines 5 to 26 each data item is assigned to the class with the minimum distance. This grouping means that each data item belongs to the cluster whose center is nearest to it. Once the members of each cluster are known, the next step is to determine the new cluster centers.
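A minimal Python sketch of this distance and assignment step is given below. It is an illustration rather than the system's PHP code; the function names `euclidean` and `assign` and the sample points are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension (equation 1)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(records, centroids):
    """Return, for each record, the index of its nearest centroid."""
    labels = []
    for record in records:
        # Distances to every centroid; with three centroids: hc1, hc2, hc3.
        distances = [euclidean(record, c) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels

# Illustrative points only.
points = [(1, 1, 1), (10, 10, 10), (100, 100, 100)]
centers = [(0, 0, 0), (9, 9, 9), (90, 90, 90)]
labels = assign(points, centers)
```

If two centroids are equally near, `distances.index` returns the first one, so ties go to the lower-numbered cluster in this sketch.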

Pseudocode 3 K-means Algorithm Calculating Central Centroid 1
Pseudocode 3 is the calculation of the new centroid for cluster 1. The centroids of clusters 2 and 3 are calculated in the same way as in Pseudocode 3. The next step is to check whether any data item has moved to a different class.
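The centroid update can be sketched in Python as the per-dimension mean of a cluster's members. The function name `update_centroid` is hypothetical, and returning `None` for an empty cluster is an assumption of this sketch; the source does not say how an empty cluster is handled.

```python
def update_centroid(records, labels, cluster):
    """Mean of all records currently assigned to `cluster`, per dimension."""
    members = [r for r, label in zip(records, labels) if label == cluster]
    if not members:
        return None  # assumption: caller keeps the previous centroid
    count = len(members)
    # zip(*members) groups the values of each dimension together.
    return tuple(sum(column) / count for column in zip(*members))

# Illustrative values only.
rows = [(1.0, 2.0, 3.0), (3.0, 4.0, 5.0), (100.0, 100.0, 100.0)]
new_c1 = update_centroid(rows, [0, 0, 2], 0)
```

The same function serves all three clusters by changing the `cluster` argument.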

Pseudocode 4 K-means Algorithm for Data Transfer Check
Pseudocode 4 checks whether any data item has moved between classes by comparing the current grouping with the previous one. If any data item has moved, the calculation in Pseudocode 2 is performed again; if none has moved, the iteration stops.
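Putting the steps together, the iteration with its stopping check can be sketched as below. This is a generic K-means loop written in Python for illustration, not the system's code; the function name `kmeans`, the `max_iter` safety limit, and the sample data are assumptions. `math.dist` requires Python 3.8 or later.

```python
import math

def kmeans(records, initial_centroids, max_iter=100):
    """Repeat assignment and centroid update until no record changes cluster."""
    centroids = list(initial_centroids)
    labels = None
    for _ in range(max_iter):  # safety limit, an assumption of this sketch
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(r, centroids[k]))
            for r in records
        ]
        if new_labels == labels:
            break  # no data moved between classes: stop iterating
        labels = new_labels
        for k in range(len(centroids)):
            members = [r for r, lab in zip(records, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Illustrative data only.
data = [(1, 1, 1), (2, 2, 2), (50, 50, 50), (52, 52, 52), (200, 200, 200), (210, 210, 210)]
start = [(1, 1, 1), (50, 50, 50), (210, 210, 210)]
final_labels, final_centroids = kmeans(data, start)
```

Comparing the whole label list with the previous one is equivalent to checking every record for a class change, which is the stopping rule described above.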
The Matching method for determining land suitability classes is arranged as a program flow model based on the sugarcane growing requirements in Table 2, which yields the rules seen in Pseudocode 5.

Pseudocode 5 Matching Methods
Pseudocode 5 is the implementation of the Matching method. Land suitability classes are assigned through a matching stage, which yields a set of rules for each land suitability class. Line 1 applies the matching to all land data, so that each land unit is classified into the appropriate land suitability class.
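A rule-based matching of this kind might look like the Python sketch below. Two things here are assumptions and not taken from the source: the threshold values are hypothetical placeholders (they are not the values of Table 2), and the overall class is taken as the worst class among the parameters, a limiting-factor rule that is common in matching-based evaluations. For brevity the sketch uses a single not-suitable class, N.

```python
CLASS_ORDER = ["S1", "S2", "S3", "N"]  # best to worst

# HYPOTHETICAL thresholds for illustration only; NOT the values of Table 2.
# Ranges are checked in order, so the first range that contains the value wins.
RULES = {
    "temperature_c": [((24, 30), "S1"), ((22, 32), "S2"), ((21, 34), "S3")],
    "rainfall_mm": [((1500, 2500), "S1"), ((1200, 3000), "S2"), ((1000, 4000), "S3")],
    "slope_pct": [((0, 8), "S1"), ((0, 15), "S2"), ((0, 30), "S3")],
}

def rate(parameter, value):
    """Suitability class of a single parameter value."""
    for (low, high), suitability in RULES[parameter]:
        if low <= value <= high:
            return suitability
    return "N"

def match(land):
    """Overall class: the worst rating among all parameters (assumed rule)."""
    ratings = [rate(parameter, land[parameter]) for parameter in RULES]
    return max(ratings, key=CLASS_ORDER.index)

example = match({"temperature_c": 26, "rainfall_mm": 2800, "slope_pct": 5})
```

In this example the temperature and slope rate as S1 but the rainfall rates as S2, so the land unit as a whole is classified S2.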

2) Mapping of Sugar Cane Production and Conformity
Visualizing the data as interactive maps requires the Leaflet library, with OpenStreetMap as the base map. The following describes the mapping process for the 2015 sugarcane production data. The process begins by entering the manage production data menu and inputting the production data, namely harvested area, planted area, and the amount of production, obtained from the 2015-2017 Indonesian Plantation Statistics. In this menu, the admin can delete, edit, and add production data. The admin can then calculate the cluster class for the data of the selected year by selecting the calculate class cluster menu.
The steps for manually calculating the K-means algorithm for the 2015 sugarcane production data of Central Java are explained at this stage. Three initial cluster centers are used. Table 3 lists the initial cluster centers, where c1 is taken from the smallest production data, c2 from random data, and c3 from the largest production data. The next step is to calculate the Euclidean distance, i.e. the distance of each i-th data item to each center point. The following is an example of calculating the distance of the Banyumas production data to the cluster center points.
Once the Euclidean distance values are known, the distances to the cluster centers are compared to find the minimum.
Lowest distance = min(C1, C2, C3) = min(303.334, 1485.048, 52406.952) = 303.334
Once the minimum distance is known, each data item is grouped into the cluster to which it has the minimum distance. A new centroid is then calculated for each cluster as the average of its members, that is, the sum of all members divided by the number of members. The same calculation is then repeated to find the distance of the data to the new cluster centers. The process repeats until no data item moves between classes. Table 4 shows the final cluster result for the sugarcane production data. The results of the K-means calculation in the application are the same as those of the manual calculation. The following are the cluster results in the application.
The clustered data are then converted by the system from the database into GeoJSON. The GeoJSON data are visualized on the OpenStreetMap map using the Leaflet library with a script like that in Figure 9. Figure 9 is a fragment of the Leaflet code. In line 1, an OpenStreetMap map focused on Central Java Province is called. In line 8, the mapping is done based on the GeoJSON converted from the database. The result is shown in Figure 10. Figure 10 displays the mapping of sugarcane production levels in Central Java Province in 2015, with a color attribute based on the cluster to which each region belongs. The designed application uses three clusters, namely low, sufficient, and high. The color attribute on the map makes it easy for users to distinguish one cluster from another.
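The conversion of clustered rows into GeoJSON can be sketched as follows. The system performs this step in PHP; Python is used here only for illustration. The row layout, the function name `to_feature_collection`, the color codes, and the polygon coordinates (a placeholder square, not a real regency boundary) are all hypothetical. The output structure follows the GeoJSON FeatureCollection format, in which positions are given as [longitude, latitude] and a polygon ring is closed by repeating its first position.

```python
import json

# Hypothetical color scheme for the three clusters.
CLUSTER_NAMES = {0: "low", 1: "sufficient", 2: "high"}
CLUSTER_COLORS = {0: "#d73027", 1: "#fee08b", 2: "#1a9850"}

def to_feature_collection(rows):
    """Build a GeoJSON FeatureCollection from rows holding a region name,
    a cluster index, and one closed polygon ring of [lon, lat] positions."""
    features = []
    for row in rows:
        features.append({
            "type": "Feature",
            "properties": {
                "name": row["name"],
                "cluster": CLUSTER_NAMES[row["cluster"]],
                "color": CLUSTER_COLORS[row["cluster"]],
            },
            "geometry": {"type": "Polygon", "coordinates": [row["ring"]]},
        })
    return {"type": "FeatureCollection", "features": features}

# Placeholder square, NOT the actual boundary of Pati.
rows = [{
    "name": "Pati",
    "cluster": 2,
    "ring": [[111.0, -6.8], [111.2, -6.8], [111.2, -6.6], [111.0, -6.6], [111.0, -6.8]],
}]
collection = to_feature_collection(rows)
geojson_text = json.dumps(collection)
```

Storing the color in each feature's properties lets the map layer read the style directly from the data, which is one way to obtain the per-cluster coloring described above.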

3) System Feasibility Analysis
The system feasibility analysis was carried out using the End-User Computing Satisfaction (EUCS) model with five respondents from the staff of the Central Java provincial agriculture service. The variables used are Content, Accuracy, Format, Ease of Use, and Timeliness [16]. The assessment scores are based on a Likert scale, namely Very Satisfied (4), Satisfied (3), Dissatisfied (2), and Very Dissatisfied (1). The questions asked are as follows.
CONTENT:
- The GIS provides correct information
- The GIS provides information as needed
- The GIS provides information that is easy to understand
- The GIS provides useful information for users
ACCURACY:
- The GIS provides accurate information in accordance with the wishes of the user
- The GIS provides information in accordance with user access rights
- The GIS feedback results are in accordance with the functions on the website
- The GIS presents the results of data processing correctly and in accordance with user requirements
FORMAT:
The level of user satisfaction with the GIS for mapping sugarcane production and land suitability is determined by converting the score using the scale of user satisfaction levels in Table 5. Based on Table 5, the EUCS test earned a score of 3.23 out of a maximum of 4, so it is classified as Satisfied. The score shows that the system is feasible and acceptable to users.
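How such a score can be computed and converted is sketched below in Python. The mean over all Likert answers is a straightforward reading of the text, but the conversion bands are an assumption: they are hypothetical equal-width intervals over the range 1 to 4 and are not taken from Table 5. The function names and the sample answers are likewise illustrative.

```python
def mean_score(responses):
    """Average of all Likert answers (1 to 4) over all respondents and items."""
    scores = [score for respondent in responses for score in respondent]
    return sum(scores) / len(scores)

def satisfaction_label(score):
    """Convert a mean score to a label using HYPOTHETICAL equal-width bands
    (width 0.75 on the 1-4 range); these are not the intervals of Table 5."""
    if score > 3.25:
        return "Very Satisfied"
    if score > 2.50:
        return "Satisfied"
    if score > 1.75:
        return "Dissatisfied"
    return "Very Dissatisfied"

# Illustrative answers from two respondents, not the survey data.
answers = [[4, 3], [3, 3]]
average = mean_score(answers)
label = satisfaction_label(average)
```

Under these assumed bands, the reported score of 3.23 also falls in the Satisfied band, which is consistent with the classification stated above.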

Conclusion
Based on the test results, the system is able to map the regions of Central Java by sugarcane production data using the K-means algorithm into three groups, namely low, sufficient, and high. The regions of Cilacap, Banyumas, Purbalingga, Banjarnegara, Kebumen, Purworejo, Magelang Regency, Boyolali, Klaten, Wonogiri, Grobogan, Demak, Semarang Regency, Temanggung, Kendal, Batang, Pemalang, Brebes, and Semarang fall into the low production category. Karanganyar, Blora, Rembang, Kudus, Jepara, Pekalongan, and Tegal Regencies fall into the sufficient production category, while Sragen and Pati are classified as high production. The system also maps land suitability data using the Matching method with the suitability classes S1, S2, S3, N1, and N2. The EUCS test score of 3.23 (Satisfied) out of a maximum of 4 shows that the system is feasible and acceptable to users.
The Department of Agriculture can use the GIS for mapping sugarcane production and land suitability as a consideration when setting sugarcane management policies, both for expanding sugarcane land and for overcoming the factors that inhibit sugarcane growth, so that production can increase in areas where it is not yet optimal. With these efforts, sugarcane production is expected to increase from year to year, so that sugar self-sufficiency can be realized in support of food sovereignty in Central Java Province.
In further research, other algorithms should be combined with this approach to analyze sugarcane production in more depth, for example the Backpropagation Neural Network algorithm to predict sugarcane production in subsequent years for each region in Central Java Province.
The determination of land suitability classes for sugarcane can be refined further by adding other assessment parameters such as water availability, rooting media, nutrient retention, and toxicity.