Purpose: RobotReviewer is a machine learning system for semi-automated assistance in risk of bias assessment. The tool’s performance on randomized controlled trials (RCTs) in the field of nursing remains unknown. We therefore aimed to evaluate the agreement in risk of bias assessment between RobotReviewer and human reviewers. Design: Evaluation study using a retrospective diagnostic design. Methods: We used RobotReviewer as the index test and human reviewers’ risk of bias assessment reported in Cochrane reviews as the reference test. The convenience sample comprised electronically available English-language full texts of RCTs included in Cochrane reviews with nurs* in the title. We assessed random sequence generation, allocation concealment, and blinding (of personnel or participants, and of outcome assessors), corresponding to the 2011 version of the Cochrane risk of bias tool. Two independent research teams performed and double-checked data extraction and analysis. We calculated sensitivity, specificity, the receiver operating characteristic (ROC) curve, the area under the ROC curve, predictive values, observed percentage of agreement, and Cohen’s kappa (including confidence intervals, if applicable). Findings: The selection process yielded 190 RCTs published between 1958 and 2016 in 23 Cochrane reviews published between 2000 and 2018. Missing assessments of risk of bias domains in Cochrane reviews or RobotReviewer yielded varying sample sizes per risk of bias domain. Sensitivity ranged from 0.44 to 0.88 and specificity from 0.48 to 0.95. Positive predictive value was highest for allocation concealment (0.79) and lowest for blinding of outcome assessors (0.25). Cohen’s kappa indicated moderate agreement for randomization (0.52), allocation concealment (0.60), and blinding of personnel/patients (0.43). Agreement for blinding of outcome assessors was only slight (0.04).
Conclusions: This is the first evaluation of risk of bias assessment by RobotReviewer in RCTs included in nursing-related Cochrane reviews. RobotReviewer showed a moderate degree of agreement with human reviewers for randomization and allocation concealment, and adequate sensitivity for detecting low risk of selection bias. Clinical Relevance: Based on our results, RobotReviewer can support risk of bias assessment of RCTs in some risk of bias domains. However, human reviewers should supervise the semi-automated assessment process.
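
As an illustration of the agreement measures named in the Methods, the following minimal Python sketch derives sensitivity, specificity, predictive values, observed agreement, and Cohen’s kappa from a single 2×2 table for one risk of bias domain. It is not the study’s analysis code. The names `TwoByTwo` and `agreement_metrics` are hypothetical, the counts are invented for illustration and are not taken from the study, and the sketch assumes that "low risk" is treated as the positive category and that no row or column of the table is empty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cross-classification for one domain (assumption: 'low risk' = positive)."""
    tp: int  # index test low risk, reference low risk
    fp: int  # index test low risk, reference high/unclear risk
    fn: int  # index test high/unclear risk, reference low risk
    tn: int  # index test high/unclear risk, reference high/unclear risk


def agreement_metrics(t: TwoByTwo) -> dict:
    """Return diagnostic accuracy and agreement measures for a 2x2 table."""
    n = t.tp + t.fp + t.fn + t.tn
    observed = (t.tp + t.tn) / n
    # Agreement expected by chance, from the marginal totals of both raters.
    expected = (
        (t.tp + t.fp) * (t.tp + t.fn) + (t.fn + t.tn) * (t.fp + t.tn)
    ) / n**2
    return {
        "sensitivity": t.tp / (t.tp + t.fn),
        "specificity": t.tn / (t.tn + t.fp),
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
        "observed_agreement": observed,
        "kappa": (observed - expected) / (1 - expected),
    }


# Invented counts, for illustration only (not study data).
example = TwoByTwo(tp=40, fp=10, fn=10, tn=40)
metrics = agreement_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Kappa corrects the observed agreement for the agreement expected by chance, so a domain can show a high percentage of agreement and still have a low kappa when one category dominates.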