Seeing Cleft Lip from a New Angle: Crowdsourcing to Determine Whether Scar Severity or Lip Angle Matters More to the General Public
Anne M. Sescleifer, BS1, Tamara A. Osborn, BA1, Jeffrey D. Rector, MA2, Alexander Y. Lin, MD, FACS1.
1Saint Louis University School of Medicine, St. Louis, MO, USA, 2University of California, Davis, Davis, CA, USA.

PURPOSE: Modern cleft lip surgery aims to balance the Cupid's bow peaks to create a level, normal-appearing Cupid's bow. However, families tend to ask many more questions about the degree of scarring. We hypothesized that the lip angle, which is often the surgeon's focus, would be less influential than scar severity in layperson ratings. We used a novel technique of simulating different lip angles and scar thicknesses, and assessed the relative contributions of these two factors via online crowdsourcing.
METHODS: We received IRB approval to modify patients' postoperative photos with Adobe Photoshop to create 5 levels of scarring (none, minimal, mild, moderate, severe) and 5 levels of lip angle (0, 5, 10, 15, and 20 degrees). Each child's resulting 25 composite images were presented in pairs to internet raters using Amazon Mechanical Turk (300 unique pairs per child). Raters selected the simulated postoperative result they felt had the more normal appearance. Each picture pair was presented four times, with order counterbalanced, for a total of 1,200 trials per child. The Bradley-Terry (BT) model was fit to the data with different predictors: scar level alone, angle level alone, and scar and angle (with and without an interaction term). The Akaike Information Criterion (AIC) was used to identify the best-fitting model. A smaller AIC value indicates a better fit to the data, especially if the absolute difference is greater than 10. A Wald statistic tested whether the two factors exert equal influence on ratings.
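The design and model comparison described above can be sketched in a few lines. This is a minimal illustration on simulated choices, not the authors' analysis: the abstract does not name the software or the exact parameterization, so the level coding (0 to 4), the "true" coefficients used to simulate raters, and the helper names (simulate, fit_bt, solve) are all assumptions. The sketch relies on the fact that a BT model whose image "ability" is a linear function of predictors is equivalent to a logistic regression, without intercept, on the difference in predictors between the chosen and the rejected image.

```python
import itertools
import math
import random

SCAR_LEVELS = range(5)   # assumed coding: 0 = none ... 4 = severe
ANGLE_LEVELS = range(5)  # assumed coding: 0 = 0 degrees ... 4 = 20 degrees


def features(image, terms):
    """Predictor values for one composite image, given as (scar, angle)."""
    scar, angle = image
    full = {"scar": scar, "angle": angle, "interaction": scar * angle}
    return [full[t] for t in terms]


def simulate(beta_scar=-0.4, beta_angle=-0.7, seed=0):
    """Simulate rater choices; the coefficients are hypothetical, not from the study."""
    rng = random.Random(seed)
    images = list(itertools.product(SCAR_LEVELS, ANGLE_LEVELS))  # 25 composites
    pairs = list(itertools.combinations(images, 2))              # 300 unique pairs
    trials = []
    for a, b in pairs:
        # Four presentations per pair, two in each left/right order.
        for left, right in [(a, b), (b, a)] * 2:
            d = beta_scar * (left[0] - right[0]) + beta_angle * (left[1] - right[1])
            left_wins = rng.random() < 1.0 / (1.0 + math.exp(-d))
            trials.append((left, right) if left_wins else (right, left))
    return trials  # list of (winner, loser)


def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))  # partial pivoting
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_bt(trials, terms, iters=25):
    """Fit the structured BT model by Newton-Raphson; return (coefficients, AIC)."""
    k = len(terms)
    # Each row is predictors(winner) - predictors(loser).
    rows = [[w - l for w, l in zip(features(win, terms), features(lose, terms))]
            for win, lose in trials]
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x in rows:
            eta = sum(b * v for b, v in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))  # modeled P(observed winner wins)
            for i in range(k):
                grad[i] += (1.0 - p) * x[i]
                for j in range(k):
                    hess[i][j] += p * (1.0 - p) * x[i] * x[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    loglik = sum(-math.log1p(math.exp(-sum(b * v for b, v in zip(beta, x))))
                 for x in rows)
    return beta, 2 * k - 2 * loglik  # AIC = 2k - 2 log L


trials = simulate()
models = {
    "scar only": ["scar"],
    "angle only": ["angle"],
    "scar + angle": ["scar", "angle"],
    "scar + angle + interaction": ["scar", "angle", "interaction"],
}
results = {name: fit_bt(trials, terms) for name, terms in models.items()}
for name, (beta, aic) in results.items():
    print(f"{name:28s} AIC = {aic:8.2f}  coefficients = {[round(b, 3) for b in beta]}")
```

Because the simulated raters here have no true interaction effect, the interaction model is not expected to win on AIC in this toy run; the point is only to show how the 300 pairs, the 1,200 counterbalanced trials, and the four candidate models fit together.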
RESULTS: Postoperative photos of two children with primary cleft lip repair were used: a 22-month-old boy (17-month follow-up) and a 6-year, 3-month-old girl (5-year, 9-month follow-up). Twelve hundred crowdsourced pairwise ratings were collected for each patient (2,400 combined ratings). The AIC values showed that the best-fitting BT model incorporates Angle+Scar+Interaction. Within this model, the magnitude of the angle estimate is greater than that of the scar estimate, and this difference is statistically significant for both Child 1 and Child 2 (Wald test z = 2.671 and 7.173, **P=0.004 and ***P<0.0001, respectively). The two children's photo ranks were strongly correlated (Spearman rank correlation = 0.957, ***P<0.001).
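As a quick check on how the Wald statistics relate to the reported P-values, the sketch below converts each z value to an upper-tail probability of the standard normal distribution. The abstract does not say whether the test was one- or two-sided; the one-sided convention used here is an assumption, chosen because it gives values close to the tabulated P-values. The function name one_sided_p is illustrative.

```python
import math


def one_sided_p(z):
    """Upper-tail probability P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Wald statistics as reported for each child.
wald_z = {"Child 1": 2.671, "Child 2": 7.173}
p_values = {child: one_sided_p(z) for child, z in wald_z.items()}

for child, p in p_values.items():
    print(f"{child}: z = {wald_z[child]}, one-sided P = {p:.3g}")
```

A two-sided test would simply double these probabilities, which would not change the conclusion for either child.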

Crowdsourced Results for Lip Angle vs. Scar Severity
Measure                                            Child 1 (male)    Child 2 (female)
Patient Demographics
  Age at Photo                                     22 months         6 years, 3 months
  Follow-up Time                                   17 months         5 years, 9 months
Bradley-Terry Model (AIC, smaller is better fit)
  Scar Only                                        2047.38           2181.82
  Angle Only                                       1916.98           1808.28
  Scar + Angle                                     1731.23           1693.81
  Scar + Angle + Interaction                       1678.98           1808.28
Wald Test
  Wald statistic                                   2.671             7.173
  Wald P-value                                     0.00378           3.7 x 10^-13
Spearman Correlation of Photo Ranks
  Vs. Other Child's Photo Ranks                    0.957             0.957

CONCLUSION: Our novel simulation of different cleft lip primary surgery outcomes focused on two factors: the angle of the cleft lip repair and the severity of the scar. Internet crowdsourcing shows that the postoperative lip angle has a significantly greater influence on ratings of normal appearance than does the severity of scarring. This finding contradicts our hypothesis. Although patients may ask about scars more often, their perceptions of a cleft lip repair result are more likely to be influenced by the angle of the repair. This insight can help guide the preoperative discussion, intraoperative decision-making, and postoperative reassurance of the family.

