Enhancing Learning Outcomes: A Study on the Development of Higher Order Thinking Skills based Evaluation Instruments for Work and Energy in High School Physics

This study aimed to enhance student learning outcomes in the field of work and energy within senior high schools through the development of evaluation instruments based on higher-order thinking skills (HOTS). Higher-order thinking encompasses advanced cognitive abilities such as analysis, evaluation


Introduction
Physics education at the senior high school level is an important part of preparing students to understand the basic concepts of physics and develop higher-order thinking skills. Rapid technological development makes humans live side by side with technology (D. A. Kurniawan et al., 2022). The importance of physics in the development of technology and the sustainability of human activity confirms that a positive attitude towards physics is important for people to have, especially students as the future successors of a nation (F. Kurniawan et al., 2023). However, in reality, student learning outcomes in physics are often low, especially on the topic of work and energy. One of the factors that can influence learning outcomes is the lack of use of evaluation instruments that can encourage students to think at a higher level (Liu & Pásztor, 2022). The evaluation instruments generally used in learning physics in high school tend to focus on factual knowledge and conceptual understanding, while higher-order thinking skills are often neglected. Minister of Education and Culture Regulation No. 65 of 2013 emphasizes the importance of using inquiry-based learning in the 2013 Curriculum to improve student learning outcomes. In this case, an evaluation tool is needed that is in accordance with the learning approach, namely an evaluation tool that focuses on higher-order thinking skills (HOTS).
Higher-order thinking involves students' ability to analyze, evaluate, and create. This ability is important in solving complex problems and applying physics concepts in real contexts. Newman and Wehlage revealed that higher-order thinking (HOT) requires students to manipulate information and ideas in ways that change their meaning and implications, such as when students combine facts and ideas to synthesize, generalize, explain, make a hypothesis, or reach a conclusion or interpretation. By using higher-order thinking, students will learn in more depth and understand concepts better (Liu & Pásztor, 2022; Tambunan, 2019). In addition, with higher-order thinking, students can differentiate ideas, present good arguments, solve problems, build explanations, hypothesize, and understand complex things more clearly (Huang et al., 2022).
The learning outcomes of students in the 2013 Curriculum are expected to be assessed through the use of a series of assignment instruments that involve activities of analyzing, evaluating, creating, connecting concepts, interpreting, providing appropriate arguments, and making decisions in problem-solving (Nurjanah, 2021; Waizah & Herwani, 2021). To be a quality HOTS measurement instrument, the instrument must meet the criteria of validity and reliability (Husna et al., 2020). In addition, the evaluation instrument must also function as a means to familiarize students with the types of HOTS questions (Asriadi & Istiyono, 2020; Husna et al., 2020; Setiawan et al., 2021).
As a reference in developing HOTS evaluation instruments, the test model of the Programme for International Student Assessment (PISA), developed by the OECD, has been widely used as an international standard (Asriadi & Istiyono, 2020). The PISA test emphasizes students' ability to reason and solve problems (Agasisti et al., 2023). However, research results show that students' abilities in reasoning, analysis, evaluation, and creation on PISA questions are at low and medium levels (Robertson, 2021). This is caused by the lack of student practice in solving HOTS questions. Therefore, it is necessary to develop an assessment instrument specifically designed to train students' HOTS abilities.
In the context of learning physics, the use of appropriate evaluation instruments has an important role in increasing student participation in the learning process, developing critical thinking skills, and being able to apply physics concepts in real situations.
Appropriate evaluation instruments will allow students to be actively involved in the learning process. By involving students in activities such as analyzing, evaluating, creating, connecting concepts, and interpreting information, evaluation instruments can encourage students to think deeply about the physics concepts being studied (Arafah et al., 2021).
In addition, the right evaluation instrument will also encourage the development of students' critical thinking skills. Critical thinking skills involve students' abilities to analyze information, evaluate arguments, identify assumptions, and construct logical thinking. By using evaluation instruments that require critical thinking, students will be trained in dealing with intellectual challenges, asking in-depth questions, and presenting arguments based on evidence and logical thinking (Rahman et al., 2021). Furthermore, appropriate evaluation instruments can also help students apply physics concepts in real situations. Through questions and assignments that require students to use physics concepts to solve problems or deal with real situations, students will be better trained in connecting abstract concepts with the real world. This allows students to see the relevance and practical application of the physics concepts being studied, thereby increasing their overall understanding.
By using appropriate evaluation instruments, students will not only become passive recipients of information, but also become active and cognitively engaged learners. They will be able to develop critical thinking skills needed in everyday life and be able to apply physics concepts in a variety of real situations. All of these will have a positive impact on students' understanding, achievement, and interest in learning physics (Wisman et al., 2021).
Therefore, this study aims to develop an evaluation instrument based on higher-order thinking that can improve student learning outcomes in the subject matter of work and energy in senior high schools. By using appropriate evaluation instruments, students are expected to be more involved in the learning process, develop critical thinking skills, and be able to apply physics concepts in real situations.
This study will analyze the effectiveness of using evaluation instruments based on higher-order thinking in improving student learning outcomes in the subject matter of work and energy in senior high schools. The results of this study are expected to provide practical recommendations for teachers and curriculum developers in improving evaluation approaches that can encourage higher-order thinking and improve student learning outcomes in physics.

Method
The research method used in this study is the research and development (R&D) method, which aims to produce a product and test its effectiveness (Sugiyono, 2013). The product to be studied is a HOTS-based evaluation instrument that is used to measure students' metacognitive abilities in understanding work and energy material. The approach used in this method is ADDIE, which consists of five stages: Analysis, Design, Development, Implementation, and Evaluation.
d. Implementation: The evaluation instrument that has been developed will be implemented in learning physics for high school students. Teachers and students will be involved in using this instrument to measure students' metacognitive abilities and higher-order thinking skills in the subject matter of work and energy.
e. Evaluation: After the evaluation instrument is used in learning, the data obtained will be evaluated. The evaluation will involve statistical analysis to measure the validity and reliability of the evaluation instrument. In addition, a comparative analysis will be carried out between the evaluation results before and after the use of the instrument to measure the increase in student learning outcomes.
f. Improvement: Based on the results of the evaluation, the evaluation instrument can be further refined to increase its effectiveness and validity. Revisions and improvements to the instrument can be made to support the development of Higher-Order Thinking Skills (HOTS) and improve student learning outcomes in the subject matter of work and energy.
This research method will provide an in-depth understanding of developing evaluation instruments based on Higher-Order Thinking Skills (HOTS) to improve student learning outcomes in the subject matter of work and energy.

1. The writing on this instrument is easy to read.
2. The images presented are attractive and easy to understand.
3. The presented image can help solve the question.
4. The use of communicative and easy-to-understand language.
5. Available instruments are in accordance with work and energy material.
Appearance
6. Design the appearance of attractive learning media with the right colors.
7. The text in the HOTS evaluation instrument is easy to read.
8. The images contained in the HOTS evaluation instrument are in accordance with the material presented.
9. The suitability of the language used with the rules of Indonesian.
10. The images used are clear and proportionate.
17. Learning with HOTS evaluation instruments makes physics learning enthusiastic.
18. The HOTS evaluation instrument encourages interest in learning physics.
19. The HOTS evaluation instrument made me understand more about work and energy.
20. The HOTS evaluation instrument can help me improve my Higher-Order Thinking Skills.

Result and Discussion
The result of this study is the development of an evaluation instrument based on Higher-Order Thinking Skills (HOTS) using a paper-based test as a medium to measure students' abilities in the subject matter of work and energy in learning physics in high school. The research method used is the research and development method using the ADDIE model, which consists of five stages, namely Analysis, Design, Development, Implementation, and Evaluation.
During the Analysis phase, a needs analysis is conducted to determine the research objectives, identify weaknesses in existing evaluation instruments, and assess the needs of students. The results obtained in this phase involve the development of higher-order thinking (HOT) instruments based on the learning achievements of high school students. Literature findings indicate that the selection of question characteristics is based on the needs of students to have critical and creative thinking abilities when facing everyday life challenges (Utomo, 2023). Questions containing elements of high-level thinking or HOTS are expected to train students to think critically and creatively when they encounter challenges presented in the questions. Furthermore, the selection of question characteristics that encourage high-level thinking becomes increasingly important in improving the quality of education in Indonesia (Hartini & Martin, 2020; Naibaho & Ritonga, 2023).
Indicators for measuring high-level thinking skills (Table 2) encompass several aspects. One of these is the ability to analyze (C4), which includes the skill of breaking down concepts into smaller components and connecting them to gain a more comprehensive understanding. The ability to analyze allows an individual to break down problems or information into more easily understood parts (Hauptman & Cohen, 2011). Another indicator is the ability to evaluate (C5), which involves the capacity to determine the degree of something based on norms, criteria, or specific benchmarks. This enables a person to make precise judgments regarding information or situations by considering specific standards (Forawi, 2016). The third indicator is the ability to create (C6), which involves the capacity to combine elements into something new, complete, and comprehensive (Setlik & Silva, 2023). This includes the ability to create new solutions, ideas, or concepts that go beyond simply combining existing elements (Latifah et al.).
The Design stage, which is the second stage, involves the design of Higher-Order Thinking (HOT) based evaluation instruments consisting of 20 multiple-choice questions, 10 essay questions, and an assessment rubric. These instruments are designed to stimulate students to think at a higher level. The outcome of this stage is an evaluation tool that will be used to measure students' understanding and high-level thinking abilities in the context of learning the work and energy material. The design of the instruments involves various steps, including the formulation of questions that promote analysis, evaluation, and creativity. Additionally, an assessment rubric is developed to assist in assessing students' responses objectively and consistently. Theoretical analysis is used as the basis for the design of these instruments. This includes a review of relevant literature and a conceptual framework to ensure that the instruments align with the research objectives and learning needs (Kurniawan et al., 2023; Zakwandi & Istiyono, 2023). Thus, the Design stage is a crucial step in developing HOT-based evaluation instruments, aimed at measuring students' high-level thinking abilities and promoting deeper learning and education oriented towards critical and creative thinking (Amaliyah, 2023).
The Development stage involves the creation of evaluation instruments based on the previously established design. During this stage, the products developed by the researcher are validated by several validators, including subject matter experts, media experts, and teachers. The purpose of validation is to determine whether the products are suitable for use or require improvements. The instruments must also undergo expert review and validation to ensure their quality, relevance, and effectiveness in measuring high-level thinking skills (Hikmawati et al., 2023). Validation is a crucial step to ensure that the evaluation instruments are reliable and accurately measure the intended learning outcomes (Linuwih & Safutra, 2023). It helps identify areas that may need improvement and provides valuable feedback for refining the instruments before they are used in an educational context.

a. Validity Analysis
The validity used in this study includes Content Validity and Construct Validity. Content validity is used to measure the extent to which the instrument comprehensively covers the expected subject matter (Sun et al., 2022). In the context of high school physics, the HOT evaluation instrument should effectively encompass the topics of work and energy. To ensure content validity, the instrument must be designed considering essential components of the subject matter and Higher-Order Thinking aspects like analysis, evaluation, and creation. In this context, the construct measured is the ability of Higher-Order Thinking to understand and apply work and energy concepts.
Evaluation instruments function as a measure to assess knowledge achievement after a learning process. According to the Minister of Education and Culture Regulation (Permendikbud) No. 69 of 2013, various forms of assessment can be used, including self-assessment, authentic assessment, tests, portfolio-based assessment, and the like (Desiriah & Setyarsih, 2021). The feasibility of an assessment instrument can be viewed from several aspects, such as the level of validity of the instrument, the consistency or reliability of the instrument, the level of difficulty, the discriminating power of the items, and the effectiveness of the distractors among the answer options (Desilva et al., 2020). An instrument is considered suitable and usable if the results of expert validation fall within the "suitable" and "very suitable" categories (Weisdiyanti & Juliani, 2023). The development of the evaluation instrument involved subject matter experts competent in the field of physics. They evaluated the test items to ensure they covered various relevant aspects of work and energy in alignment with the curriculum.
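To make the item-level aspects named above more concrete, the following is a minimal sketch of how the level of difficulty and the discriminating power of a multiple-choice item are commonly computed in classical test theory. The paper does not state which formulas were used, so the function names, the 27% upper and lower group split, and the sample scores are all assumptions chosen for illustration, not the study's procedure or data.

```python
from typing import List


def difficulty_index(item_scores: List[int]) -> float:
    """Proportion of students who answered the item correctly (scores are 0 or 1)."""
    return sum(item_scores) / len(item_scores)


def discrimination_index(item_scores: List[int],
                         total_scores: List[float],
                         fraction: float = 0.27) -> float:
    """Upper-group minus lower-group proportion correct.

    Students are ranked by total test score; the top and bottom `fraction`
    of students form the two groups (0.27 is a common convention, assumed here).
    """
    n = len(total_scores)
    k = max(1, round(n * fraction))
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_scores[i] for i in upper) / k
    p_lower = sum(item_scores[i] for i in lower) / k
    return p_upper - p_lower


# Hypothetical responses of ten students to one item, with their total scores.
item = [1, 1, 1, 0, 0, 0, 1, 0, 1, 1]
totals = [18, 17, 15, 6, 5, 4, 14, 7, 16, 12]
print(difficulty_index(item))              # proportion correct
print(discrimination_index(item, totals))  # upper minus lower proportion
```

A difficulty index near 0 marks a very hard item and one near 1 a very easy item, while a larger positive discrimination index means the item separates high-scoring from low-scoring students more clearly.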

Figure 1. Expert Validation Results
Figure 1 shows the latest research results that have been validated by material experts. Assessment by material experts resulted in a high score, namely 86.25%, with the criteria "Very Suitable". Likewise, assessment by other material experts resulted in a score of 77.50%, also with the criteria "Very Suitable". This finding is consistent with previous findings by Mantau, who reported a score of 85% on the same criteria (Mantau & Talango, 2023). Likewise, Wardhani found a result of 80% with the "Very Appropriate" criterion in research that focused on a similar evaluation instrument (Wardhani & Setiyarsih, 2021). Apart from that, Suciati, who examined the content validity of similar evaluation instruments at the national level, also showed positive results with a score of 88% (Suciati, 2022). Content validity analysis also supports these findings by showing that the evaluation instrument effectively covers core concepts and applications in the subject matter. The results of this research provide a strong basis that the evaluation instruments used cover the subject matter in question well. In addition, testing the instrument on a group of students who represent the population to be assessed, as done by Bahufite et al. (2023), adds to the reliability of the results of this research by ensuring that the instrument can be applied effectively in learning contexts in the field. Figure 2 shows the data from this trial, which were used for construct validity analysis. User assessments (e.g., by teachers or students) yield a score of 78.75% with the criteria of "Very Suitable." The results of the construct validity analysis indicate that the HOT evaluation instrument can effectively measure the Higher-Order Thinking abilities of students related to the work and energy topics in high school physics. This is supported by factor analysis indicating that the test items in this instrument contribute to measuring Higher-Order Thinking skills (Masturi et al., 2023). In conclusion, the HOT evaluation instrument developed has successfully passed content and construct validity analyses with positive results. Content validity ensures that the instrument effectively covers relevant subject matter, and construct validity shows that the instrument is capable of measuring the Higher-Order Thinking abilities of students related to work and energy in high school physics (Zhou et al., 2023). This affirms that the HOT evaluation instrument aligns with the measurement objectives and possesses strong validity (Masturi et al., 2023).
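The percentage scores reported in this subsection can be read as the ratio of the points a validator awarded to the maximum possible points. The sketch below illustrates that arithmetic only. The 4-point rating scale, the 20-statement sheet, the sample ratings, and especially the category cutoffs are assumptions made for illustration, because the paper does not state the scale or the boundaries between categories.

```python
from typing import Sequence, Tuple


def feasibility_percentage(ratings: Sequence[int], max_per_item: int) -> float:
    """Awarded points divided by maximum possible points, as a percentage."""
    return 100.0 * sum(ratings) / (max_per_item * len(ratings))


# Hypothetical cutoffs (lower bound, label); NOT taken from the paper.
ASSUMED_CUTOFFS: Tuple[Tuple[float, str], ...] = (
    (76.0, "Very Suitable"),
    (51.0, "Suitable"),
    (26.0, "Less Suitable"),
)


def feasibility_category(percent: float,
                         cutoffs: Tuple[Tuple[float, str], ...] = ASSUMED_CUTOFFS) -> str:
    """Return the first label whose lower bound the percentage reaches."""
    for lower_bound, label in cutoffs:
        if percent >= lower_bound:
            return label
    return "Not Suitable"


# Illustrative sheet: 20 statements rated on an assumed 1-4 scale.
ratings = [4] * 9 + [3] * 11
percent = feasibility_percentage(ratings, max_per_item=4)
print(percent, feasibility_category(percent))
```

Passing the cutoffs as a parameter keeps the sketch usable with whichever category table a given study actually adopts.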

b. Reliability Analysis
Reliability analysis is used to assess the extent to which the developed evaluation instrument can provide consistent and dependable results over time. In the context of developing the Higher Order Thinking (HOT) evaluation instrument for work and energy in high school physics, reliability is crucial to ensure the instrument consistently measures students' abilities (Martawijaya et al., 2023). A common method for measuring instrument reliability is using the Cronbach's Alpha coefficient. The Cronbach's Alpha value typically ranges from 0 to 1, where higher values indicate higher reliability. An instrument is considered to have good reliability if its Cronbach's Alpha value is close to or greater than 0.7 (Muflikhun & Setyarsih, 2022). In the reliability analysis of the HOTS evaluation instrument for work and energy in high school physics, data from the instrument's trial with a group of students is used. This data includes student scores on each test item within the instrument. Based on Table 3, the results of the reliability analysis using Cronbach's Alpha coefficient show a value of 0.81. This indicates that the HOTS evaluation instrument has good reliability because its Cronbach's Alpha value is greater than 0.7. In other words, the instrument consistently provides dependable results in measuring the Higher Order Thinking abilities of students related to work and energy in high school physics (Abdurrahman et al., 2023). In conclusion, the results of the reliability analysis demonstrate that the HOTS evaluation instrument consistently provides dependable results in measuring Higher-Order Thinking abilities (Istiyono et al., 2023). This affirms that the instrument can be effectively used for evaluation in the context of high school physics learning in the work and energy topics (Arifin et al., 2023).
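As an illustration of the reliability coefficient discussed above, the following sketch computes Cronbach's Alpha from a table of item scores using the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the sample scores are hypothetical. The study's own data and the software used to obtain the reported value of 0.81 are not given in the text.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's Alpha for a score table with one row per student
    and one column per item (sample variances are used throughout)."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    item_variances: List[float] = [
        variance([row[j] for row in scores]) for j in range(k)
    ]
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores of five students on four items (not the study's data).
sample = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 3],
    [1, 2, 1, 2],
    [3, 3, 4, 4],
]
alpha = cronbach_alpha(sample)
print(round(alpha, 2), "good" if alpha >= 0.7 else "below the 0.7 guideline")
```

The final comparison applies the 0.7 guideline cited in the text, and the same function can be run on the full matrix of trial scores.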
An instrument is said to be feasible and usable if the validation results by experts are in the feasible and very feasible categories (Weisdiyanti & Juliani, 2023). Based on the figure, the validated research results show that the assessment by media experts yields a score of 86.25% with the criteria "Highly Suitable," while the assessment by material experts results in a score of 77.50% with the criteria "Highly Suitable." The assessment by users (e.g., teachers or students) yields a score of 78.75% with the criteria "Highly Suitable." From these results, it can be concluded that the development of Higher-Order Thinking-based evaluation instruments for teaching physics in high schools, specifically the topic of work and energy, is considered highly suitable and effective.
In the Implementation stage, the evaluation instruments that have been developed are applied in high school physics education. The data collected during the implementation process involve student feedback, test results, and observations, which serve as the basis for evaluation in the subsequent stage, the Evaluation stage (Blegur et al., 2023). The Evaluation stage is conducted for several primary purposes. First, it aims to measure the validity of the evaluation instruments, which is the extent to which the instruments can measure what they are intended to. This assesses whether the instruments align with the learning objectives and to what extent the evaluation results reflect students' true understanding and abilities (Firdaus et al., 2023). Second, evaluation also aims to measure the reliability of the evaluation instruments (Vázquez-Villegas et al., 2023; Xu et al., 2023). Reliability refers to the extent to which the instruments consistently measure a concept or skill over time. The evaluation results can help assess whether the instruments provide consistent and dependable results. Furthermore, the Evaluation stage is also designed to assess whether the use of Higher-Order Thinking-based evaluation instruments has led to an improvement in students' higher-order thinking abilities (Weisdiyanti & Juliani, 2023). This provides insight into the effectiveness of the instruments in promoting and measuring an enhancement in students' thinking skills. In summary, the Implementation and Evaluation stages play a crucial role in evaluating the evaluation instruments, ensuring their quality, and measuring their impact on student learning (Kurniawan et al., 2023). The results from the Evaluation stage provide valuable information for refining Higher-Order Thinking-based evaluation instruments and enhancing physics education in high schools.
Based on Figure 3, the media experts' assessment indicates that the evaluation instrument based on Higher Order Thinking Skill (HOTS) in learning physics of work and energy material in high school for class X students is considered feasible overall, with an average score of 73.67%. Learning media in the form of teaching materials are therefore needed to make it easier for students to understand learning material (Astalini et al., 2021). Adaptive learning media can empower students' HOTS with good assessment scores (Sulistyanto et al., 2022, 2023). This assessment includes aspects of display feasibility, content feasibility, and language feasibility. Furthermore, based on the assessment of material experts, the evaluation instrument was considered very feasible with an average score of 84.72%. This assessment also includes aspects of display feasibility, content feasibility, and language feasibility.
Apart from these findings, other research findings that are relevant to the development of Higher-Order Thinking-based evaluation instruments for physics learning in high school can provide additional support. Examples include the research conducted by Pinto et al. (2023). Likewise, Istiyono (2017) concluded that such instruments were able to improve students' understanding of high-level physics concepts. By considering these findings together, it can be concluded that the development of a Higher Order Thinking-based evaluation instrument for physics learning, especially on the topic of work and energy in high school, is considered very appropriate and effective. These findings are consistent with previous research results, confirming the validity of the instrument and providing a strong basis to support its effectiveness in improving physics learning at the high school level.
Apart from the findings in this research, several previous studies also recorded positive results related to the development of Higher-Order Thinking Skill (HOTS) evaluation instruments in physics learning at various school levels. One of them is a study by Sugiarti et al. (2017a). The results of the research show that the characteristics of the critical thinking skills assessment instrument are open-ended. This instrument meets several indicators, namely analyzing arguments, deduction, induction, and displaying information in the form of scenarios, text, graphs, and tables. The research instrument developed is useful and meets the requirements for use as a measuring instrument. This is shown by the results of data analysis, which confirm that the instrument has achieved content validity based on expert assessment and obtained empirical evidence under both classical test theory and the Rasch model. Another is the research by Ramadhan et al. (2019). The research results show a validity value of 3.94 on a rating scale of 3.0 ≤ SV ≤ 4.0, which falls in the "correct" category, so the instrument can be implemented with minor revisions. The practicality of the assessment was obtained from the readability of the assessment instrument with a percentage of 81%, the difficulty level of the instrument with a percentage of 72%, and respondent responses with a percentage of 83%. The effectiveness of the assessment is obtained from learning completeness, where classical completeness is 100% and indicator completeness is 90.5%. Thus, the assessment of high-level physics thinking skills based on local wisdom can be said to be worthy of dissemination.
In addition, Bahtiar et al. (2020) also carried out similar research regarding the HOTS instrument in physics learning at the secondary level. They evaluated the HOTS instrument for the concepts of gravity and mechanical energy. The results of this research indicate a significant increase in students' higher-order thinking abilities as well as better learning outcomes. Additionally, Sugiarti et al. (2017b) took a similar approach in developing a HOTS instrument for high school students in the context of thermodynamics. The results of this research confirm that the use of the HOTS instrument can encourage students' abilities to analyze, evaluate, and apply the physics concepts they learn in real situations.
In the context of developing the HOTS evaluation instrument for work and energy material in high school physics, the results of this previous research provide a strong basis and support that this kind of instrument is effective in increasing students' understanding of the material and higher-order thinking abilities. This research continues this positive trend by producing a feasible and effective evaluation instrument for physics learning purposes at the high school level.
In addition, the assessment of teaching experts also stated that the HOT-based evaluation instrument for studying the physics of work and energy in high school was very feasible, with an average score of 77.50%. This assessment also includes aspects of display feasibility, content feasibility, and language feasibility. Thus, based on the assessment of the three groups of experts, the evaluation instrument is considered appropriate for use. The analysis was also carried out based on user assessments, namely students, by looking at the student response questionnaire data recapitulation diagram (Table 1). Based on the calculation of the score of 20 questions, an average percentage of 82% is obtained for the product being developed. This shows that students gave a positive response to the HOT-based evaluation instrument in learning physics on work and energy in high school (Dewi et al., 2023; Juita et al., 2023). Based on the conclusions from the expert assessment and student responses, it can be concluded that the HOT-based evaluation instrument in learning physics on work and energy in high school is very appropriate to use.
By using the ADDIE model and going through these stages, this study produced a HOTS-based evaluation instrument that could be used in learning physics on work and energy in high school.The results of the development at each stage of the research made an important contribution to improving the quality of learning physics and students' higher-order thinking skills in understanding and applying physics concepts in real situations (Mukti et al., 2023;Mulyana & Desnita, 2023).

Conclusion
In this study, the development of an evaluation instrument based on Higher-Order Thinking (HOT) was carried out for learning the physics of work and energy in senior high schools. The research and development method used is the ADDIE model, which consists of five stages: Analysis, Design, Development, Implementation, and Evaluation. The results of the development of this evaluation instrument show the feasibility and effectiveness of its use. Based on the assessment of media experts, material experts, and publisher experts, this evaluation instrument as a whole is considered feasible to use. The assessment includes aspects of display feasibility, content feasibility, and language feasibility. Student responses also showed positive results for this evaluation instrument. With the HOT-based evaluation instrument, students are expected to be more involved in the learning process, develop critical thinking skills, and be able to apply physics concepts in real situations. These skills are expected to keep pace with global demand in the future (Listiaji et al., 2022). This evaluation instrument can also assist teachers in measuring students' metacognitive abilities in understanding work and energy material (Berry Devanda et al., 2023; Sari et al., 2023). Thus, the development of HOTS learning evaluation instruments on the physics material of work and energy in high school has the potential to improve student learning outcomes. This research makes an important contribution to the development of a more comprehensive evaluation approach that is oriented towards higher-order thinking in physics learning at the senior high school level.
