Analyzing Customer Engagement with Gamification Approach in the Banking Sector Using the Rasch Model

This study investigates the instruments used to measure consumer engagement in mobile banking applications that employ gamification strategies. Customer engagement plays a crucial role in the effectiveness of e-marketing strategies, particularly for relationships, products, services, and brands. Gamification has emerged as a promising approach for promoting content marketing. Primary data from active users of mobile banking applications were collected from various banks and analyzed using the Rasch model to assess the instrument's efficiency. The study involved 451 participants and considered demographics, customer involvement, psychological aspects, game elements


INTRODUCTION
The rapid advancement and widespread adoption of digital technology, particularly through mobile applications [1], have revolutionized numerous sectors, including banking [2]. Mobile banking applications epitomize this digital transformation, offering unprecedented convenience by enabling transactions without physical cash or paperwork, thereby reshaping the business landscape and influencing social dynamics [3]. Mobile banking usage has surged in recent years, in step with the rise in global mobile application downloads from 140.7 billion in 2016 to 204 billion in 2019, which underscores the global transition towards digital banking solutions. However, this shift also unveils a significant challenge: sustaining user engagement. With data revealing that only 32% of mobile application users engage with an app more than ten times, and 25% abandon it after a single use, the banking sector faces a pressing need to reimagine strategies for retaining customer interest and participation [4,5].
These statistics clearly indicate the limited level of user engagement with mobile apps. Consequently, businesses operating these apps encounter a significant challenge in sustaining user engagement [6]. Gamification, the integration of game elements into non-gaming contexts [7], emerges as a potent strategy to enhance user engagement [8] by leveraging psychological drivers to foster deeper interaction with mobile banking applications [9]. Despite the promising potential of gamification, empirical research exploring its impact on user engagement within mobile banking remains sparse [6].
A deeper understanding of app engagement is provided by a recent study by Ho and Chung [10]. In addition, previous literature was limited because many studies discussed gamification only as a context [11] and did not connect it to existing theories that explain how gamification elements drive motivational processes [12]. By recognizing the distinctive qualities of every individual, gamification systems can provide personalized solutions for individual users and increase their engagement and satisfaction [13][14][15].
This gap is particularly pronounced in understanding how gamification can transform user behaviour and elevate the banking experience. Our study addresses this lacuna by focusing on the effectiveness of gamification in boosting customer engagement in the banking sector, an area of critical importance given the competitive and fast-evolving nature of digital finance.
Studies examining the use of gamification in banking have found that it significantly improves system performance and consumer engagement [16,17]. Furthermore, the mean values of INFIT and OUTFIT MNSQ, along with INFIT and OUTFIT ZSTD, have been reported to closely approximate the desired thresholds, and the instrument exhibited a strong level of reliability, with a coefficient alpha of 0.94 [18]. Rodrigues' (2016) research on gamification in e-banking measured the absolute fit index, which determines how well the model fits the sample data, using structural equation modelling (SEM). The findings show that e-banking use is influenced by both ease of use and enjoyment [19]. Assessing convergent validity plays a pivotal role in the measurement model by evaluating the extent to which the indicators align with a specific variable. Discriminant validity is evaluated using the square root of the Average Variance Extracted (AVE) [17].
The findings indicate reliable discriminant validity, supported by correlations among the dimensions that are below the square root of the AVE [20]. Aydınlıyurt et al. [21] used Cronbach's alpha and AVE to evaluate instruments measuring individual behavioral influence in gamified mobile applications. Hamidi and Safareeyeh [22] used SPSS and Cronbach's alpha to assess how employing CRM systems in m-banking adoption affected customer happiness and engagement, which is said to be the most crucial element in the banking industry's performance. According to Dzandu et al. (2022), gamification of mobile money payments can increase consumer value by having a positive social impact. Their study combined measurement tools with SPSS and SEM-PLS (Structural Equation Modeling-Partial Least Squares) and demonstrated a strong and favorable connection between customers, social impact theory, and gamified mobile money payments (GMMP) [23]. Nasirzadeh and Fathian [18] used SPSS to assess data on how gamification may be customized for people with various demographic and psychological features, particularly in the banking industry.
Among the various measurement options available, Rasch measurement is a valuable approach. Its primary advantage lies in its ability to assess whether a scale measures independently of a particular sample. Moreover, it provides valuable insights at both the scale and item levels, enabling the adaptation of existing scales to specific research contexts [24]. The Rasch model, a mathematical framework, uses probability estimation to evaluate the measurement characteristics of a rating scale [25]. As a latent logistic model, it prioritizes precise measurement. It posits that an individual's likelihood of providing a correct response to a test item is influenced by their ability and the item's level of difficulty, both derived from empirically validated data [26]. Rasch analysis aims to enhance the precision of evaluations for both individuals and items, thereby contributing to various aspects of validity and accuracy [27]. The Rasch model characterizes respondent proficiency (person ability) and item difficulty as latent variables [28][29][30]. It assumes that the slope, or discrimination parameter (i.e., how well an item distinguishes between levels of ability), is constant across all items [25].
One advantage of the Rasch model, as opposed to the Item Response Theory (IRT) model, is its capability to facilitate comparisons between individuals' abilities and the difficulty of items, irrespective of the specific sample or set of items used [31]. Furthermore, the Rasch measurement model provides more comprehensive insights into products, models, and the fit with human attributes. By enabling prompt identification of certain measurement issues, the Rasch measurement model acts as a valuable complement to traditional test theory and approaches based on IRT [32]. Rasch analysis can be used both to evaluate the psychometric properties of existing instruments and to develop new ones [33]. Rasch analysis, which is mostly used for instrument creation, also shows benefits in terms of item reduction [30]. Brush and Soutar [34] investigated scale measurement properties and invariance using Rasch analysis in tourism and recreation studies.
Researchers recommend using Rasch analysis to investigate whether the measurement invariance assumption holds before conducting substantive testing [34]. According to Stolt et al. [35], Rasch analysis supports the validation of instruments created in nursing science, and familiarity with it is therefore advisable in that field. In the study by Goswami et al. [36], Rasch analysis was used to evaluate dimensionality, model-data fit, item difficulty, individual stigma, the distribution of items and individuals on the item-person map, and rating scale functioning for each HPASS subscale (Stereotypes, Discrimination, and Prejudice). According to Hergesell [24], Rasch measurement overcomes several weaknesses inherent in Classical Test Theory (CTT) in tourism research and provides opportunities to improve the fit of data to the Rasch model and to fine-tune scales. In research conducted by Soeharto and Csapó [37], Rasch analysis showed that an inductive reasoning test adapted for students met the criteria of validity and reliability based on Rasch parameters. Rasch analysis is regarded as a highly effective testing method because it provides extensive information on individual items, their dependability, and the suitability of response formats [38].
This paper examines several variables encompassing demographics, customer engagement, psychological factors, and game elements. Prior studies have identified a range of game elements employed in the banking sector, comprising a total of 21 essential elements: announcements, points, awards, ratings, badges, scores, tasks, feedback, leaderboards, bid hunt, timers, levels, shares, social interactions, penalties, avatars, lotteries, virtual prizes, epic meanings, informative content, and random prizes. These game elements have been recognized for their potential to increase customer engagement and promote desired behavior in a banking context [39]. The 26 dimensions consist of demographics, customer engagement, self-efficacy, accountability, belongingness, announcements, points, awards, ratings, badges, scores, tasks, feedback, leaderboards, bid hunt, timer, levels, share, social interaction, penalties, avatars, lotteries, virtual prizes, epic meanings, informative content, and random prizes.
Furthermore, this research pioneers the application of the Rasch model to evaluate the measurement tools used to assess customer engagement in gamified banking environments. The Rasch model, with its robust mathematical foundation, offers a unique lens through which to examine the alignment between user engagement levels and gamification elements, ensuring precise, scalable, and interpretable measurements. By employing the Rasch model, we aim to provide a nuanced understanding of customer engagement dynamics in mobile banking, contributing valuable insights to both academic discourse and practical applications in the banking industry.
In summary, this study not only explores an innovative approach to enhancing customer engagement through gamification but also introduces a sophisticated analytical framework for measuring engagement levels. By bridging these two pivotal areas, our research endeavors to enrich the literature on digital banking and gamification while offering actionable strategies for banking institutions to captivate and retain customers in the digital age.

Overview of the Rasch analysis
The Rasch model, a mathematical framework, uses probability estimates to evaluate the reliability and precision of rating scales, offering a robust methodology for assessment [25]. Because raw scores are nonlinear, it is not reasonable to assume that the difference between two consecutive raw scores represents a consistent interval. This is one of the reasons why the Rasch measurement technique is employed, and some of these concerns are discussed by Wright [40]. Consider the raw scores obtained by three individuals in an assessment of social skills using a Likert scale: Jim obtains a raw score of 20, Sue 22, and Jen 32. With Strongly Agree coded as 4, Agree as 3, Disagree as 2, and Strongly Disagree as 1, we can deduce that Jen demonstrates a higher level of social skills than the other two respondents, because her raw score is the highest. The raw scores alone, however, do not show how much higher, since equal raw-score differences need not represent equal differences in the underlying trait. Rasch analysis employs a logit scale to rank the difficulty of items and the abilities of respondents, where higher values indicate higher levels of ability or difficulty and lower values indicate the opposite [41]. A Rasch analysis takes several elements into account, including item measures, person measures, the Wright map, fit statistics, the Rasch reliability index, and rating scale analysis. To demonstrate the practical application of Rasch analytical methodologies in school psychology, one study illustrates their implementation by creating and piloting a self-report rating scale [33].
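The nonlinearity of raw scores can be illustrated with a small sketch. It converts a raw Likert total into a log-odds value, which is only a first approximation and not a full Rasch estimate. The number of items (`N_ITEMS = 10`) is an assumption made for illustration, since the example above does not state it, and the function name is hypothetical.

```python
import math

def raw_score_to_logit(raw, n_items, min_cat=1, max_cat=4):
    """Log-odds of a raw Likert total: a first approximation, not a Rasch estimate."""
    lowest = n_items * min_cat    # minimum possible total
    highest = n_items * max_cat   # maximum possible total
    if not lowest < raw < highest:
        raise ValueError("extreme scores have no finite logit")
    proportion = (raw - lowest) / (highest - lowest)
    return math.log(proportion / (1 - proportion))

N_ITEMS = 10  # assumption: the number of items is not given in the text

scores = {"Jim": 20, "Sue": 22, "Jen": 32}
logits = {name: raw_score_to_logit(raw, N_ITEMS) for name, raw in scores.items()}

# The same 2-point raw gap spans different distances on the logit scale,
# depending on where it sits in the score range.
mid_gap = raw_score_to_logit(26, N_ITEMS) - raw_score_to_logit(24, N_ITEMS)
top_gap = raw_score_to_logit(38, N_ITEMS) - raw_score_to_logit(36, N_ITEMS)

for name, value in logits.items():
    print(f"{name}: raw={scores[name]}, logit={value:+.2f}")
print(f"24->26: {mid_gap:.2f} logits; 36->38: {top_gap:.2f} logits")
```

Under these assumptions, a two-point raw difference near the middle of the range covers a much smaller logit distance than the same difference near the top, which is the reason equal raw-score steps cannot be treated as equal intervals.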

Gamification
Gamification is the strategic integration of game elements into non-gaming contexts or activities, aimed at enhancing customer engagement and fostering long-term retention [42]. The majority of studies investigating the use of gamification in banking have concluded that it significantly improves system performance and increases consumer engagement, and that applying gamification to a variety of industries, including banking, can produce strong outcomes [16,17]. This is because there is a clear link between gamification and the desire to use mobile banking services, which is supported by their widespread use. By employing a gamification strategy and creating suitable financial services, designers have the opportunity to enhance the appeal of banking operations. This, in turn, can lead to increased customer satisfaction and engagement. By understanding the essential concepts and implementing a balanced selection of gamification techniques, it becomes possible to attain widespread consumer acceptance and value [16]. Rodrigues et al. [17] developed a conceptual model to examine the effects of gamification on bank websites and demonstrated that it significantly affects web page design elements, user friendliness, and intent to use e-banking, among other things. Rahi et al. [43] offered a solution for gamified internet banking and noted that, when gamification makes internet banking websites enjoyable and lets users profit from the reward system, users' intentions to adopt the system and to recommend it to others on social networks improve. According to related studies, gamification significantly enhanced customer interactions, use intention, and interest in social connections. The researchers proposed a conceptual model for predicting behavioral intent in gamified commercial apps and linked enterprises [44].

Demographic information
Previous research has used gender and age as the demographic data for distinguishing users in gamification systems. Orji [45] states that gender influences behavior and that men and women view behavior modification tactics differently. Orji therefore proposed gender as a reliable basis for tailoring and recommended a gender-specific approach to design. In a related investigation, Orji et al. [46] demonstrated that the techniques they employed were more effective in persuading women.
Oyibo et al. [47] investigated the role that culture plays in the influence of age and gender on the social influence of persuasive technology and demonstrated that there are significant differences between women and men and between younger and older people in collectivist and individualist cultures. They also provided recommendations for tailoring personalization strategies to culture, age, and gender. PBL (points, badges, and leaderboards) is an interesting concept. According to some researchers who examined older adults' gaming experiences, the majority of currently available gamification systems are designed for younger audiences, have little relevance to older users, and cause them considerable stress when used [48]. Additionally, Denden et al. [49] have demonstrated that as children get older, their preferences, play motives, and play styles change from being performance-focused to being resolution-, choice-, and enjoyment-focused. They advise designers to concentrate more on objectives that favor setting and emotional connection as people age.

Gamification elements in banking
Recent advancements in the use of gamification systems have increased the need for a diversified understanding of the many people who use them across numerous domains [15,50,51]. Gamification is frequently linked to game elements, business applications, information systems, and entertaining tools. Fun-focused gamification endeavors to facilitate customer self-realization and foster sustained utilization of the system over the long term.

Table 1. Gamification elements
No. | Reference | Game element | Description
1 | | Announcement |
2 | | Point |
3 | | Award |
4 | | Rating |
5 | | Badge | Awarding superior customers with badges or medals as a symbol of their accomplishments and a sign that they belong to a particular group.
6 | | Score | Recording the score for each section, in an effort to keep consumers motivated to raise their total score.
7 | [59][60][61] | Task | Task completion menu consisting of banking tasks.
8 | | Feedback | Feedback items can be entered for both the user and the bank, making it possible to distinguish between one customer's wants and another's.
9 | | Leaderboard | Showing the performance of every consumer to foster a spirit of cooperation and competitiveness.
10 | | Hunt for offers | Users search for deals that can be redeemed for discounts or other forms of promotions.
11 | [50] | Timer | There is a timer for each game.
12 | [50] | Level | Delivering personalised presents to consumers at different levels based on their performance.
13 | [56] | Share | Users are encouraged to share, which promotes the sharing of information in society and boosts user motivation.
14 | [15,52,60,61,62] | Social interaction | Possibility for customers to socialize with others through technology.
15 | [60,62] | Penalty | Restricting service provision in response to negative customer behavior in the banking system.
16 | [15,60,62] | Avatar | A customer icon (avatar) can be used as a thumbnail in a private profile.
17 | [15,62] | Lottery | Conducting customer lotteries and awarding the winner.
18 | [15,60,62] | Virtual reward | Giving customers virtual presents that can be gathered as collections, and using virtual currency with uses in the financial system.
19 | [52,62] | Epic meaning | Fostering in customers a sense of belonging to something greater than themselves and working for a greater good, even if it may not directly benefit them.
20 | [15,62] | Informing | Providing customers with more information on banking benefits and services.
21 | [15,62,63] | Random reward | Giving consumers impromptu and unusual presents.

The review found 21 different gamification elements (see Table 1). Some elements, such as social networks and social interactions inside a system, may fall under more than one category; however, for the sake of this review, each element is categorized under a separate heading.
Table 1 categorizes various gamification elements used in banking to engage customers, such as announcements for user interaction, points based on performance that can be exchanged, rewards for completing challenges, rankings, badges for achievements, scores to motivate improvement, task completion menus, feedback mechanisms, leaderboards, promotional hunts, timed games, levels offering personalized rewards, sharing features, social interaction opportunities, penalties for negative behaviors, avatars for personalization, lotteries, virtual rewards, epic meanings to foster belonging, informational services, and random rewards. Each element is designed to enhance user experience, foster community, and incentivize participation in the banking system.

Customer engagement
The concept of engagement has been examined extensively across multiple disciplines, including marketing, management, information systems, and information technology management [64]. Information systems and information technology experts view user engagement as part of a broader state of flow, a condition that denotes a person's level of involvement and enjoyment in an activity [65].
Engaging customers strengthens long-term ties with them, which boosts corporate competitiveness [66]. Bowden states that engagement is a "psychological process" involving emotional and cognitive components [67]. According to Vivek et al. [68], customer engagement is the level of involvement and interaction with a company's goods and initiatives that customers or organizations initiate. Businesses create environments that enhance user involvement in order to significantly increase their chances of success Field [69].
Previous studies have looked into how customers assess the features of Islamic banks; Ahmad et al. [69] found that customers have a more favorable opinion of the quality of service in Islamic banks than in conventional banks. Customer preferences were consistently reflected in the aspects of tangibility and empathy that were used to evaluate the quality of the services. According to Estiri et al. [70], cost, product offerings, and other service qualities and value propositions are important factors in customer satisfaction but do not always translate into loyalty. Echchabi and Nafiu Olaniyi [71] observed that, among other things, branch location and service quality influenced Malaysian clients' preferences for Islamic banking. In a similar vein, Amin et al. [72] connected client loyalty with the bank image. Zhavoronok et al. [73] analyzed the role of digital technology in the transformation of regional models of household financial behavior and found that innovation processes play an important role in national economies, largely owing to the digitalization of financial relations in all developed countries.

METHOD
This study used five stages, as illustrated in Figure 1. These stages include (1) Formulation of a scale, (2) Data collection, (3) Data analysis, (4) Test of validity and reliability, and (5) Development of an instrument rating scale. A detailed description of each stage is given below.

Formulation of the scale
The customer engagement survey employed a Likert scale with the following response options: 1: Strongly Disagree, 2: Disagree, 3: Uncertain or Neutral, 4: Agree, and 5: Strongly Agree. The Rasch model converts the ordinal item scores from the Likert rating scale into an interval scale expressed in log-odds units (logits). In practice, most logit values fall between -5.00 and +5.00 [74]. Item and person fit statistics reveal the degree to which the results obtained are appropriate, dependable, and consistent with the underlying measure, and they offer details about the measurement's accuracy. After the Likert scale was formulated, the next step was to develop an instrument in the form of a 48-statement questionnaire tailored to customer involvement with a gamification approach in the banking sector. The instrument uses a Likert scale (1-5), which produces ordinal data. The assessment criteria for analyzing the results are derived from the standards established by Fisher Jr [75]. Table 2 provides a guide, as outlined by Fisher Jr., for assessing the quality of the instrument.

Data collection
Data collected through measurement plays a crucial role in establishing the connection between empirical observations and quantitative mathematical representations. In this study, primary data from user questionnaires in the field of mobile banking were used, encompassing multiple banks across Indonesia. The research sample comprised 451 individuals who were active users of mobile banking applications. The data were collected among mobile banking users in Yogyakarta, Central Java, and North Sumatra, Indonesia.
The data collection process was designed to capture a wide array of perceptions of, and interactions with, gamification elements in banking apps. The research targeted a demographically diverse group of participants to ensure the inclusivity of the data. Participants were recruited through online forums, social media platforms, and email invitations sent to bank customers, with the selection criteria focusing on individuals who had used mobile banking applications within the last six months.
The survey instrument was carefully designed to gather quantitative and qualitative data on users' experiences with gamification.Questions were developed to assess the frequency of app usage, specific gamification features encountered, and the perceived impact of these features on engagement levels.To enhance the reliability and validity of the survey, items were pre-tested in a small pilot study, leading to refinements based on feedback.
The Rasch model was chosen for its robust capability to transform ordinal survey responses into interval measures, which are essential for accurate comparisons and assessments of user engagement. This model allows for the evaluation of both item difficulty (in this context, the engagement level required to interact with a gamification feature) and person ability (the user's likelihood to engage with the app due to gamification), providing a nuanced analysis of gamification's impact. The application process involved several key steps:
(1) Data Preparation: Responses were coded and entered into the Rasch analysis software, ensuring that the data met the model's assumptions.
(2) Model Fitting: The Rasch model was applied to the dataset, with iterative adjustments made based on fit statistics and item-person interaction analyses.This process helped identify and address any anomalies or misfitting items, ensuring the model's accuracy.
(3) Interpretation of Results: The final output included person measures (indicating levels of user engagement) and item measures (reflecting the engagement potential of each gamification feature).These metrics were then interpreted in relation to our research questions, providing insights into how different gamification elements influence user engagement in mobile banking apps.
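The model-fitting step can be illustrated with a deliberately simplified sketch. It fits a dichotomous (0/1) Rasch model by joint maximum likelihood, alternating Newton updates for persons and items. This is a stand-in for illustration only: the study analysed five-category Likert data in dedicated Rasch software, and the function name, the toy response matrix, and the step limit used here are all hypothetical.

```python
import math

def rasch_jmle(data, max_iter=500, tol=1e-8):
    """Joint maximum likelihood for a dichotomous Rasch model.

    data: list of person rows with 0/1 responses; no person or item
    may have an extreme (all-0 or all-1) score.
    Returns (person measures, item difficulties) in logits.
    """
    n_persons, n_items = len(data), len(data[0])
    person_scores = [sum(row) for row in data]
    item_scores = [sum(row[i] for row in data) for i in range(n_items)]
    theta = [0.0] * n_persons   # person measures
    beta = [0.0] * n_items      # item difficulties

    def prob(t, b):
        # Probability of a positive response for ability t and difficulty b
        return 1.0 / (1.0 + math.exp(b - t))

    def clamp(step):
        # Limit each Newton step to keep early iterations stable
        return max(-1.0, min(1.0, step))

    for _ in range(max_iter):
        largest = 0.0
        for n in range(n_persons):
            p = [prob(theta[n], b) for b in beta]
            info = sum(q * (1 - q) for q in p)
            step = clamp((person_scores[n] - sum(p)) / info)
            theta[n] += step
            largest = max(largest, abs(step))
        for i in range(n_items):
            p = [prob(t, beta[i]) for t in theta]
            info = sum(q * (1 - q) for q in p)
            step = clamp((sum(p) - item_scores[i]) / info)
            beta[i] += step
            largest = max(largest, abs(step))
        # Anchor the scale: mean item difficulty is set to 0 logits
        mean_beta = sum(beta) / n_items
        beta = [b - mean_beta for b in beta]
        if largest < tol:
            break
    return theta, beta

# Hypothetical toy data: 6 persons x 4 items, ordered from easy to hard
responses = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
]
theta, beta = rasch_jmle(responses)
print("person measures:", [round(t, 2) for t in theta])
print("item difficulties:", [round(b, 2) for b in beta])
```

In this sketch, items endorsed by more respondents receive lower difficulty estimates and respondents with higher raw scores receive higher measures, both on the same logit scale, which mirrors the person and item measures described in step (3).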

Data analysis
Using Rasch analysis, the collected questionnaire data were assessed, enabling the conversion of ordinal data into interval data. According to Bond and Fox (2015), the Rasch model is widely recognized as the most suitable approach for fundamental analysis in human sciences, particularly when questionnaires are employed, resulting in ordinal measurements [26]. According to Cavanagh and Waugh [76], the probabilistic nature of the Rasch model enables accurate prediction of individuals' responses to all items that conform to the measurement model. This prediction can be achieved by using the person parameter as the measure of individuals' abilities and the item parameter as the measure of item difficulty, both on the same scale.
This analysis uses Rasch's mathematical model to estimate an individual's proficiency level based on their response patterns to the instrument items. The mathematical formulation of the Rasch one-parameter logistic (1PL) model is given in [77]. In that formulation, P(Xi=1) is the probability that respondent i answers item i correctly; c is a constant parameter that describes the probability of the respondent answering correctly on a very easy item; b is the item difficulty parameter, which describes the level of difficulty of the i-th item; and d is the difference between the skill level of respondent i and the level of difficulty of item i.
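For orientation, the dichotomous Rasch model is commonly written in the textbook form below. This is offered as a reference sketch and not necessarily the exact expression used in [77]. The person parameter \theta_n and the person index n are notation assumed here, b_i is the item difficulty, and the difference \theta_n - b_i plays the role of d in the description above. This standard form contains no separate constant c.

```latex
P(X_{ni} = 1 \mid \theta_n, b_i)
  = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)},
\qquad
\ln \frac{P(X_{ni} = 1)}{1 - P(X_{ni} = 1)} = \theta_n - b_i
```

The second expression shows why measures are reported in logits: the log-odds of a positive response equal the distance between the person measure and the item difficulty.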

Test the validity and reliability
Upon analyzing the data, the validity and reliability tests of the developed instruments that were conducted affirm their accuracy in measuring customer engagement within the banking sector, specifically utilizing a gamification approach.

Development of an instrument rating scale
Once the instrument has been established as valid and reliable, the next step is to create a final scale that effectively assesses individual skills in customer engagement within the banking sector using a gamification approach. To evaluate the instrument, this study employs Rasch analysis, a contemporary test analysis technique capable of surpassing the limitations of classical test theory. The Rasch model makes it possible to ascertain how well the developed tests match the individuals being evaluated. The model further facilitates the examination of whether the developed test accommodates various levels of competency among the individuals being assessed through its item-person map feature. This map integrates two crucial pieces of information: the arrangement of items according to their difficulty levels and the arrangement of individuals based on their measured abilities. More specifically, the item maps offer valuable insights into the difficulty levels of the items within the assessment, identifying the most challenging and easiest items in the test.
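The idea of an item-person map can be sketched in a few lines: persons and items are placed on one shared logit axis, with persons on the left and items on the right. The function name, the bin width, and all measures below are hypothetical values chosen for illustration, not results from this study.

```python
def wright_map(person_measures, item_measures, low=-3.0, high=3.0, step=1.0):
    """Build a text item-person map: persons as '#' (left), item labels (right)."""
    n_bins = int(round((high - low) / step))

    def bin_index(measure):
        # Measures outside [low, high) are placed in the nearest edge bin
        index = int((measure - low) // step)
        return min(max(index, 0), n_bins - 1)

    persons = [0] * n_bins
    items = [[] for _ in range(n_bins)]
    for measure in person_measures:
        persons[bin_index(measure)] += 1
    for label, measure in item_measures.items():
        items[bin_index(measure)].append(label)

    lines = []
    for b in range(n_bins - 1, -1, -1):   # highest logits at the top
        lower = low + b * step
        lines.append(f"{lower:+5.1f} | {'#' * persons[b]:<8} | {' '.join(items[b])}")
    return lines

# Hypothetical measures in logits
person_measures = [1.9, 2.4, 0.3, -0.5, 1.2]
item_measures = {"I1": -1.2, "I2": 0.4, "I3": 2.1}

lines = wright_map(person_measures, item_measures)
print("\n".join(lines))
```

Reading such a map from top to bottom shows the most able persons next to the hardest items; rows with persons but no items indicate ability levels that the instrument does not target well.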

Formulation of the scale
Across the 26 dimensions related to customer involvement with a gamification approach in the banking sector, measured on a Likert scale from 1 to 5, the response most frequently chosen by respondents was 5, meaning that respondents tended to strongly agree with the items presented. Figure 2 presents the responses of the 451 respondents to the Likert scale items used in this study.

Figure 2. Respondent items
The analysis of the scale formulation for the 451 respondents showed that category 3 (undecided/neutral) and category 5 (strongly agree) accounted for 22% and 21% of responses, respectively. For items 1 and 2, which concern customer involvement, respondents were more likely to choose category 3 (undecided/neutral). This indicates that customers already understand games but are still unsure about using them within mobile banking, which points to an opportunity that can benefit both banks and users. Meaningful customer engagement can be achieved through game features that are easy, fun, and interactive. In addition, customer involvement in mobile banking applications can strengthen relationships with customers [39].

Data collection
In this research, a quantitative methodology is adopted, using a non-experimental design. The collection of measurement data is of paramount importance because it enables the identification of relationships between empirical observations and quantitative mathematical expressions. To acquire primary data, an online questionnaire was designed and distributed via Google Forms. The questionnaire was administered in Indonesian (Bahasa Indonesia) to cater to the targeted participants from Indonesia. Table 3 provides an overview of the respondents' demographic profiles, encompassing gender, age, education, occupation, frequency of mobile banking usage (on a weekly basis), and weekly hours spent on the internet, social networks, video games, and similar activities. The sample exhibits a nearly equal distribution of male and female respondents, with the majority falling within the age range of 10 to 24 years.
Linacre [78] proposes an extensive set of indicators based on the Rasch model that can be applied to both individuals and items. These indicators encompass a range of psychometric properties, including outfit mean square (MNSQ), outfit Z-standardized (ZSTD), and point-measure correlation (PT-Measure Corr.). The evaluation of the model begins by examining the outfit MNSQ value, which ideally falls within the 0.5 to 1.5 range, indicating a good fit for measurement. If the MNSQ value deviates from this range, attention shifts to the corresponding outfit ZSTD value, which should ideally be between -1.9 and 1.9, indicating reasonable data predictability. Internal consistency reliability, reflecting the average correlation among the items in the instrument, is assessed using Cronbach's α coefficient. A value approaching 1 signifies a favorable level of internal measurement consistency.
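The screening rule and the reliability coefficient described above can be sketched as follows. The thresholds are the ones quoted in the text; the function names, the returned labels, and the small response matrix are hypothetical and serve only as an illustration. Cronbach's α is computed with its standard formula, using sample variances.

```python
from statistics import variance

def outfit_flag(mnsq, zstd):
    """Two-step screen: check outfit MNSQ first, then fall back to outfit ZSTD."""
    if 0.5 <= mnsq <= 1.5:
        return "fit"
    if -1.9 <= zstd <= 1.9:
        return "acceptable on ZSTD"
    return "misfit"

def cronbach_alpha(rows):
    """Cronbach's alpha for a persons-by-items matrix of scored responses."""
    k = len(rows[0])  # number of items
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical Likert responses: 5 persons x 3 items
likert = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 3],
    [4, 4, 5],
]
alpha = cronbach_alpha(likert)

print(outfit_flag(1.10, 0.8))   # MNSQ inside 0.5-1.5
print(outfit_flag(1.70, 1.2))   # MNSQ outside, ZSTD inside -1.9..1.9
print(outfit_flag(1.70, 2.6))   # both outside
print(f"alpha = {alpha:.3f}")
```

The two-step order matters in this reading of the rule: an item is flagged as misfitting only when both the MNSQ and the ZSTD criteria fail.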
The study data were organized in Microsoft Excel and analyzed with Winsteps version 3.73. After suitable measurement intervals had been established and the validity and reliability criteria of the Rasch model had been satisfied, the data underwent binary logistic regression analysis in SPSS. This method was selected to address the dichotomous nature of the questions answered by the participants.

Instrument reliability test
Table 4 presents the statistical summary of the instrument, including person reliability and item reliability. The consistent findings confirm the accuracy and reliability of the measurements. The analysis yields two separate outputs: a person output and an item output. The person table evaluates the statistical fit of respondents to the data, whereas the item table assesses the fit of the instrument's items. As per Linacre [78], the mean value represents the signal-to-noise ratio within the data, with the mean coefficient indicating the square root of the ratio between the true person variance and the error variance present in the data. Boone et al. [79] stated that the mean person and mean item indices are a very important addition to evaluating how the measuring instrument functions. Based on Table 4, the average person measure is 1.92. An average value above 0 indicates that the respondents possess higher abilities relative to the difficulty of the items in the instrument. The person reliability index is 0.94, while the item reliability index is 0.95. The responses from the subjects demonstrate strong consistency, and the overall reliability of the items is considered excellent, following the standard guidelines in Table 1 for person and item reliability.
The optimal values for INFIT and OUTFIT MNSQ approach 1, signifying a good fit, while the desirable values for INFIT and OUTFIT ZSTD are close to 0, indicating reasonable predictability. A comprehensive analysis of the person and item tables shows that the mean values of INFIT and OUTFIT MNSQ, as well as INFIT and OUTFIT ZSTD, fall within the desired range. A higher separation value is preferred because it distinguishes a broader spectrum of subjects (those who are able versus unable) and item groups (difficult versus easy). Table 1 indicates a Separation Person value of 4.03, indicating that the instrument items are highly sensitive (very good) in covering the entire range of respondent abilities. The Separation Item value is 5.67, suggesting considerable variability among respondents and excellent detection of instrument item performance. The findings of the person and item validity tests, along with the MNSQ and PT-Measure values, are presented in Table 5. All items meet the critical values for MNSQ and PT-Measure, thus affirming their validity. Linacre [78] suggests that the sensitivity of the statistical output to outliers facilitates identifying and resolving any issues of fit between the data and the model. According to Boone et al. [79], the Outfit Mean Square, Outfit z-standard, and Point Measure correlation values are the criteria used to determine item suitability. An item that does not satisfy these requirements should be repaired or replaced. The accepted Outfit Mean Square (MNSQ) range is 0.5 < MNSQ < 1.5, which is the criterion used to determine whether an item is appropriate.
Table 5 summarizes the item statistics, which provide insight into the psychometric properties and performance of the gamified customer engagement instrument in the banking sector. The table shows that the INFIT mean square (MNSQ) value for each instrument item lies between 0.5 and 1.5, as shown in column 6. This shows that all items are appropriate and do not need to be replaced, so it can be concluded that there was no obvious disturbance or impact on the model during the statistical analysis.
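The separation and reliability figures quoted in this section are related, and the relation can be checked with a short calculation. The sketch below assumes the standard Rasch relation R = G² / (1 + G²) between a separation index G and its reliability R, and the commonly used strata formula H = (4G + 1) / 3; neither formula is stated in the text itself, and the function names are hypothetical.

```python
def reliability_from_separation(g):
    """Reliability implied by a separation index G: R = G^2 / (1 + G^2)."""
    return g * g / (1 + g * g)


def strata(g):
    """Number of statistically distinct levels: H = (4G + 1) / 3."""
    return (4 * g + 1) / 3


person_sep, item_sep = 4.03, 5.67  # separation values reported in the text
for label, g in (("person", person_sep), ("item", item_sep)):
    print(label, round(reliability_from_separation(g), 2), round(strata(g), 1))
```

Under this assumption, the person separation of 4.03 implies a reliability of about 0.94, in line with the reported person reliability. The item separation of 5.67 implies about 0.97, slightly above the reported 0.95, a difference that could stem from rounding or from which error estimate underlies each reported figure.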

Development of an instrument rating scale
The instrument's rating scale is evaluated in the next stage. This analysis verifies whether the rating options confuse respondents. Whether the rating scale functions well can be seen from the measurement results in Table 6.
Table 6 presents the measurement outcomes for each category in the rating scale. Category 1 corresponds to a score of 1 and consists of 2037 observations, accounting for 9% of the sample. The average measure for this category is -2.26, with an expected measure of -2.59. The threshold measure for Category 5, the highest category, is 2.47. These findings summarize the distribution and measurement characteristics of each category of the rating scale employed in the study.
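A simple way to verify that the rating scale functions as intended is to check that the average measures and the thresholds advance with the category number. The sketch below uses the per-category values reported for Table 6; the helper name `strictly_increasing` is hypothetical, and the check is only an illustration, not the procedure carried out by the analysis software.

```python
def strictly_increasing(values):
    """True if every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


# Average measures reported for rating categories 1-5 in Table 6.
average_measures = [-2.26, -1.26, 0.29, 1.74, 3.55]
# Thresholds for entering categories 2-5 (category 1 has none).
thresholds = [-2.20, -1.30, 1.02, 2.47]

print("average measures ordered:", strictly_increasing(average_measures))
print("thresholds ordered:", strictly_increasing(thresholds))
```

Both sequences are ordered for the reported values, which is consistent with respondents using the five options as an ascending scale.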
The Category measure values in column 5 of Table 5 show that each category functions well: across the 1-5 scale, the category measure increases sequentially from -3.52 to 3.72. This monotonic increase indicates that the rating scale is working as intended. Because no two of the five options share the same value, respondents can evidently discriminate between the answer options, from very inappropriate to very appropriate. Not all studies obtain such good rating scale results directly. Several Rasch studies report disproportionate or unbalanced categories, and sometimes the rating scale measurement is not reported at all, as in [80]. The same pattern can be seen in the response probabilities as a function of person ability shown in Figure 3. Figure 3 shows that a person with low ability is highly likely to respond with a rating of 1, whereas a person with high ability is more likely to respond with a rating of 5. Figure 3 depicts an ideal Rasch analysis model against which the instrument's overall applicability can be compared.
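The relationship between person ability and the most likely rating can be illustrated with the Andrich rating scale model. The sketch below is a minimal illustration, assuming that the threshold measures reported for Table 6 are Andrich thresholds and that the item difficulty is 0 logits; the function name `category_probabilities` and the chosen ability values are hypothetical.

```python
import math


def category_probabilities(theta, thresholds, difficulty=0.0):
    """Andrich rating scale model: probability of each category at ability theta."""
    # Cumulative sums of (theta - difficulty - tau_j); the lowest category gets 0.
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (theta - difficulty - tau))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


thresholds = [-2.20, -1.30, 1.02, 2.47]  # thresholds reported for Table 6
for theta in (-4.0, 0.0, 4.0):  # hypothetical low, middle, and high abilities
    probs = category_probabilities(theta, thresholds)
    most_likely = probs.index(max(probs)) + 1
    print(theta, most_likely, [round(p, 2) for p in probs])
```

Under these assumptions, the most likely rating is 1 at an ability of -4 logits, 3 at 0 logits, and 5 at +4 logits, which mirrors the pattern described for the category probability curves.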
In Figure 4, the scores of the instrument items are compared with the measured difficulty of the items, denoted by the blue line. The figure depicts the pattern of the tested instrument item scores, which generally follows the ideal model of Rasch analysis, represented by the red line. While not flawless, the chart demonstrates that the instrument closely approximates the ideal Rasch pattern. This alignment between the instrument item scores and the ideal Rasch model is encouraging, indicating a reasonable level of agreement between the instrument and the targeted construct or variable. The closer the alignment between the instrument item scores and the ideal Rasch model, the stronger the measurement properties of the instrument.

Discussion
In comparison, Lailiyah et al. [81], analyzing the items of a self-assessment instrument, found response patterns that lay too far from the ideal model curve. Additional analysis was therefore required, namely gradually removing misfitting items until all remaining items were consistent with the model. In Kadaryanto's study, which focuses on designing teacher professional development programs using the Rasch model, the analysis indicates a lack of alignment between the Item Characteristic Curve (ICC) and the desired pattern represented by the red line (ideal model). Ideally, the measured values should closely follow the ideal model line; in simpler terms, there are outliers in the distribution of respondents' answers to the survey questions. This suggests that some respondents may have answered the survey questions carelessly [82]. Zansen's research used Rasch measurement to examine the fairness of the listening section across gender subgroups in a nationwide high-stakes computerized English examination. The NUDIF analysis reveals that 12 items show gender-based DIF. The difference in ICC is not always discernible when pupils are separated into high- and low-achieving boys and girls [82].
The validity of the instruments used in further research can be assessed by examining the logit values of each item or person. Deviations in logit values indicate a potential problem with a particular person or item and a need for revision or replacement. Ideally, the logit (measure) of each person and item should be close to zero, indicating a better fit to the construct being measured. Another indicator of instrument validity is the standard error of measurement (S.E. Measurement). A low standard error (less than 0.5) indicates high accuracy for an item or person, meaning that the measurements are reliable and precise. A standard error between 0.5 and 1 is considered acceptable or moderately accurate, while a value exceeding 1.0 indicates poor accuracy for the person or item being measured. In addition, the Outfit Mean Square (OUTFIT MNSQ) value provides insight into the instrument's validity. The ideal range for the OUTFIT MNSQ value is typically between 0.5 and 1.5. Values in this range indicate a good fit between the instrument items and the underlying construct. Values outside this range could suggest misfit or a measurement issue that requires attention. The OUTFIT ZSTD metric is an additional means of evaluating the instrument's reliability. Values within the -2 to +2 range are deemed favorable, signifying satisfactory agreement between the observed responses and those expected under the Rasch model. Values outside this range may indicate problems with the measurement pattern or responses that deviate significantly from the expected model. Finally, the Point Measure Correlation is crucial in assessing the instrument's validity. Ideally, the Point Measure Correlation should fall within the range of 0.4 to 0.85 to demonstrate sound validity. This correlation coefficient represents the degree of association between the item's score on the instrument and the underlying construct being measured. Stronger correlations indicate more robust associations and offer supporting evidence that the instrument effectively measures the intended construct [81].
By considering these validity indicators, researchers can assess the quality and feasibility of the instruments used in the study. If the instrument exhibits favorable logit values, a low standard error of measurement, suitable Outfit MNSQ values, Outfit ZSTD values within the desired range, and a satisfactory Point Measure Correlation, it has good validity. Conversely, if these indicators fall outside the recommended ranges, further investigation and potential modification of the instrument may be required to increase its validity.
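The screening criteria listed above can be collected into a small helper. The following is a minimal sketch that applies the ranges quoted in this discussion (Outfit MNSQ 0.5 to 1.5, Outfit ZSTD -2 to +2, Point Measure Correlation 0.4 to 0.85, and the standard error bands); the function names and the example item values are hypothetical and do not come from the study's tables.

```python
def precision_label(se):
    """Classify a standard error of measurement using the bands quoted in the text."""
    if se < 0.5:
        return "high"
    if se <= 1.0:
        return "acceptable"
    return "poor"


def screen_item(outfit_mnsq, outfit_zstd, pt_measure_corr, se):
    """Check one item (or person) against the fit ranges quoted in the text."""
    checks = {
        "outfit_mnsq": 0.5 <= outfit_mnsq <= 1.5,
        "outfit_zstd": -2.0 <= outfit_zstd <= 2.0,
        "pt_measure_corr": 0.4 <= pt_measure_corr <= 0.85,
    }
    return {
        "checks": checks,
        "precision": precision_label(se),
        "fits": all(checks.values()),
    }


# Hypothetical item statistics, for illustration only.
print(screen_item(outfit_mnsq=0.92, outfit_zstd=-0.8, pt_measure_corr=0.61, se=0.07))
print(screen_item(outfit_mnsq=1.81, outfit_zstd=3.4, pt_measure_corr=0.35, se=1.20))
```

Returning the individual checks alongside the overall verdict makes it easier to see which criterion a misfitting item fails before deciding whether to revise or replace it.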
Overall, the discussion emphasizes the significance of assessing the reliability of the research tools by considering numerous elements and criteria. Validity assessment ensures that the instrument accurately measures the intended construct and provides reliable and meaningful data for research purposes.
This study's limitations include its reliance on quantitative methods without thoroughly investigating respondents' diverse backgrounds. It leaves a gap in empirical insights for banks targeting specific community segments for their marketing efforts.

CONCLUSION
In conclusion, this study highlights the overwhelmingly positive assessments provided by respondents regarding the gamification approach in the banking sector, as evidenced by the frequent selection of the highest rating on the Likert scale. This positive inclination indicates a favorable perception and active engagement with the gamified elements implemented in mobile banking applications. The Rasch analysis further confirms that respondents possess a strong understanding of the instrument items, rendering the instrument suitable without needing replacement. The research successfully captures and evaluates multiple aspects of customer engagement, thereby validating the instrument's reliability and validity. The results highlight the potential effectiveness and attractiveness of gamified elements in boosting customer involvement and improving user experiences, ultimately resulting in higher customer satisfaction.
However, it is crucial to acknowledge that these results are based on a specific sample of 451 participants. Therefore, further research is necessary to examine the generalizability of these findings to a broader population and diverse banking contexts. Qualitative data and follow-up studies can provide deeper insights into the specific aspects of gamification that resonate most strongly with customers. Bank managers can derive significant benefits from the implications of this research by increasing awareness, aligning with customer preferences, and enhancing human resources, services, and infrastructure within their organizations. Proactively anticipating customer behavior through excellent service management, product adjustments, and soliciting feedback are critical considerations for successful implementation.
This study emphasizes how crucial it is for bank executives and governmental organizations, such as the OJK, to increase public knowledge of and preference for banking by fostering extensive stakeholder collaboration. It also emphasizes how important it is for Islamic banks to improve infrastructure, streamline services, and increase the quantity and caliber of their human resources. It also emphasizes the need for banks to aggressively seek community feedback, adjust product offers, and improve service management in response to shifts in customer behavior.
Future research endeavors should explore bank customer profiles, identify target market segments, and develop tailored marketing strategies. Combining quantitative and qualitative analyses while minimizing biases in measurement tools can provide more comprehensive results. This study offers valuable insights into measuring customer engagement with gamification in mobile banking applications. The instrument exhibits good comprehension, reliability, and validity, establishing its suitability as an effective data-collection tool. The implications of this research can serve as a guiding resource for bank managers in improving their strategies to address customers' evolving needs within the digital banking landscape.

Table 1.
Gamification elements studied in research
Point: Points that can be traded in for money or specifics. Calculating points for customers based on their success or performance in the banking system.

Table 3.
The profile of research respondents

Table 4.
Statistical results of person and item reliability instruments

Table 5.
Summary of item statistics

Table 6.
Rating scale category measurement results
Category  Observations  %   Average measure  Expected measure  Infit MNSQ  Outfit MNSQ  Threshold
1         2037          9   -2.26            -2.59             1.66        1.81         none
2         1825          8   -1.26            -0.96             0.76        0.77         -2.20
3         4768          22  0.29             0.29              0.75        0.89         -1.30
4         4534          21  1.74             1.70              0.77        0.67         1.02
5         8484          39  3.55             3.55              1.28        1.25         2.47