Title: SARS-CoV-2 Infection Among Healthcare Workers in Tijuana, Mexico: A cross-sectional study

Background: Healthcare workers (HCW) are a high-risk group for COVID-19. The aim of this study is to estimate the risk of acquiring COVID-19 among HCW from the Mexican Institute of Social Security in Tijuana, Mexico. 
Methods: A cross-sectional study was conducted using the Epidemiologic Surveillance Online Notification System database, including entries from Tijuana recorded from March 11 to May 1, 2020. Multiple imputation was performed for the SARS-CoV-2 RT-PCR result variable where data were missing. The chi-squared statistic with Yates correction and OR were calculated to estimate the risk of HCW compared to the general population (GP). 
Results: From a total of 10,216 entries, 6,256 patients were included for analysis. Being a HCW was significantly associated with a higher risk of acquiring COVID-19, OR=1.730 (CI 95% 1.459-2.050). Nurses had more than double the risk of the GP (OR=2.339; CI 95% 1.804-3.032). The cluster of physicians only had an additional risk for COVID-19 of 2.8% (OR=1.028; CI 95% 0.766-1.380). Resident physicians had double the risk of the GP (OR=2.166; CI 95% 0.933-5.025). Meanwhile, interns showed a possible protective factor (OR=0.253; CI 95% 0.085-0.758). Among medical specialties, emergency medicine had the highest risk (OR=4.071; CI 95% 1.090-15.208), followed by anesthesiology (OR=2.806; CI 95% 0.544-14.466). 
Conclusion: HCW have up to 73% more risk of acquiring COVID-19 than the GP in Tijuana, Mexico. Nurses were the group at highest risk of all HCW, as a result of prolonged and close contact with patients. Emergency medicine and anesthesiology were the medical specialties most at risk because they frequently perform aerosol-generating procedures.

As a result of the uncertainty regarding disease transmission, severity, and mortality, some resources, such as face masks, sanitizers, and thermometers, soon became scarce. At present, actions are being enforced to minimize the risks in the workplace, with measures such as filtering at entry points, sanitizing hospitals, and continually providing personal protective equipment (PPE) to the medical staff. Despite this, many HCWs in Mexico still feel vulnerable and question whether the PPE with which they are provided is sufficient.9 In other countries, HCW screening has been proposed, as they are considered amplifiers of nosocomial and community transmission.7

Regardless, the measures implemented have not been sufficient to contain the escalating number of cases.

COVID-19 outbreaks have been reported all across Mexico, and several hospitals have reported internal outbreaks involving HCWs.9 The increase in the number of cases among the general population (GP) has also been reflected in HCWs,2,15,16 with sustained rises in confirmed cases. On April 24, 2020, 1,934 HCWs had a positive RT-PCR result for SARS-CoV-2, which represented 15% of the total (12,872) confirmed cases up to that day. The affected HCWs were distributed as follows: 47% physicians, 35% nurses, [...] the workforce due to infection.17

Thus, the aim of this study was to estimate the effect size of the association between being a HCW and acquiring COVID-19 at the Mexican Institute of Social Security (IMSS) in Tijuana, a US-border city in Mexico. As secondary analyses, risk estimates were stratified by HCW categories, by physician hierarchies, and by medical specialties.

MATERIALS AND METHODS.
Study design
A cross-sectional database study was conducted using data from the IMSS's Epidemiologic Surveillance Online Notification System (SINOLAVE), an internal network database that includes the records of suspected COVID-19 cases reported from different IMSS centers in Mexico. As this was secondary research from an institutional database, it was exempt from IRB review at IMSS.

The data for the study were extracted on May 11, 2020 and corresponded to the entries recorded from March 11, 2020 to May 1, 2020. The extraction criteria applied to the SINOLAVE database selected the subset of records from the Baja California delegation, including healthcare units from "all regimes". Additional information about the specific occupations of patients identified as HCWs was manually obtained through the social security number (SSN) from electronic medical records before concealing subject identities for further analysis.

The SINOLAVE database consists of the following items: patient SSN, registry date, symptoms onset date, occupation and employer, clinical history including presence or absence of signs and symptoms, personal medical history (including chronic disease, tobacco smoking, alcohol consumption and pregnancy status, as well as history of travel and contact with COVID-19 cases and/or animals), results from RT-PCR for SARS-CoV-2 from nasopharyngeal or oropharyngeal swabs or specimens from lower respiratory tract secretions, treatment, and outcomes from primary and secondary healthcare systems.

The database was filtered to only include patients of all ages registered in Tijuana, Mexico, which corresponded to those notified from primary care centers number 7, 18, 19, 27, 33, 34, 35 and 36, and secondary care centers number 1 and 20. Individuals without complete personal and clinical history were excluded, and duplicated or triplicated entries were eliminated, keeping the first chronological record or the one that fulfilled severe acute respiratory infection (SARI) criteria if it was registered at the same healthcare level. If duplicates were reported by different healthcare levels, the entry kept was either the one from the highest healthcare level or the one that included a reported laboratory test result. Data were recorded in a way that the identity of the human subjects could not be ascertained.

Patients whose registered occupation was "physician", "nurse", "laboratory staff", "dentist" or "other HCW", along with being enrolled as "IMSS employee", were defined as HCWs. Other IMSS employees whose registered occupations differed from the ones previously mentioned were reclassified as "other HCW". The remaining patients, who did not satisfy the above-mentioned criteria, were defined as GP. [...] and contact with suspect cases were considered predictors of missingness and defined as auxiliary variables for imputation before the analysis was conducted.

The mode value from the multiple imputation was assigned to registries with missing information, obtaining the following two sets of data: the complete-case analysis, excluding participants without an RT-PCR result (Analysis 1), and an alternative data set incorporating multiple imputation data and including all of the patients (Analysis 2).
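
As an illustration of how a modal imputed value can be assigned to records with a missing RT-PCR result, consider the minimal sketch below. It is not the procedure run by the authors in SPSS or STATA: the function name `assign_modal_value`, the data layout (one list per imputed data set), and the toy values are assumptions introduced here for illustration only.

```python
from collections import Counter


def assign_modal_value(observed, imputed_sets):
    """Fill missing entries (None) with the most frequent imputed value.

    observed     -- RT-PCR result per record: 1 = positive, 0 = negative,
                    None = missing (hypothetical coding)
    imputed_sets -- list of m completed data sets, each aligned with `observed`
    """
    completed = []
    for i, value in enumerate(observed):
        if value is not None:
            # Observed results are kept unchanged.
            completed.append(value)
            continue
        # Collect the m imputed values for this record and keep the mode.
        draws = [dataset[i] for dataset in imputed_sets]
        modal_value, _count = Counter(draws).most_common(1)[0]
        completed.append(modal_value)
    return completed


# Toy data (hypothetical): four records, five imputed data sets.
observed = [1, None, 0, None]
imputed_sets = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 1],
]
analysis_2 = assign_modal_value(observed, imputed_sets)
```

With a binary outcome and an odd number of imputations the mode is always unique; with an even number, a tie-breaking rule would have to be chosen.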

For the analysis of the relationship between HCW and COVID-19 case status, crude prevalence odds ratios (POR) were calculated and the χ² test with Yates correction was used in the bivariate analysis. The Mantel-Haenszel test was used to control for confounding, stratifying by age, gender, and history of chronic disease, as no other demographic data were included in the database. Statistical analysis for each set of data was conducted using IBM SPSS Statistics (Version 25) and STATA 15. Statistical significance was defined as a P-value < 0.05.
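
The quantities named in this paragraph can be sketched for a single 2x2 table as follows. This is an illustrative example under stated assumptions, not the authors' SPSS/STATA output: the function names are hypothetical, the cell counts are invented, the confidence interval is assumed to be of the Wald type on the log scale, and the stratified function returns the Mantel-Haenszel pooled odds ratio rather than the associated test statistic.

```python
import math


def prevalence_odds_ratio(a, b, c, d, z=1.96):
    """POR with a Wald 95% CI for a 2x2 table.

    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    """
    por = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(por) - z * se_log)
    upper = math.exp(math.log(por) + z * se_log)
    return por, lower, upper


def yates_chi2(a, b, c, d):
    """Chi-squared statistic with Yates continuity correction (1 df)."""
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * corrected ** 2 / denominator


def chi2_p_value_1df(statistic):
    """Upper-tail P-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio over (a, b, c, d) strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts: HCW cases/non-cases vs. GP cases/non-cases.
por, lower, upper = prevalence_odds_ratio(20, 80, 10, 90)
chi2 = yates_chi2(20, 80, 10, 90)
p_value = chi2_p_value_1df(chi2)
pooled = mantel_haenszel_or([(20, 80, 10, 90), (40, 160, 20, 180)])
```

If the pooled estimate stays close to the crude POR, as in this toy example where both strata share the same odds ratio, the stratifying variable is not acting as a confounder.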

An alternative statistical analysis using Rubin's rules for pooling multiple imputation results and binomial logistic regression to estimate the effect size of being a HCW and acquiring COVID-19 is included in Supplementary files 1-5.
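
For readers unfamiliar with Rubin's rules, the pooling step can be sketched as below. The example is a simplified illustration and not the analysis reported in the supplementary files: the function name `pool_rubin` and the log odds ratios and variances are hypothetical, and the interval uses a normal approximation instead of the t reference distribution with Rubin's degrees of freedom.

```python
import math


def pool_rubin(estimates, variances):
    """Pool m estimates (e.g. log odds ratios) with Rubin's rules.

    Returns the pooled estimate and its total variance,
    T = W + (1 + 1/m) * B, where W is the mean within-imputation
    variance and B is the between-imputation variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates with matching variances")
    pooled = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical log odds ratios and variances from three imputed data sets.
log_ors = [0.5, 0.6, 0.7]
variances = [0.04, 0.04, 0.04]

pooled_log_or, total_var = pool_rubin(log_ors, variances)
pooled_or = math.exp(pooled_log_or)
half_width = 1.96 * math.sqrt(total_var)  # normal approximation (assumption)
ci = (math.exp(pooled_log_or - half_width), math.exp(pooled_log_or + half_width))
```

Unlike assigning a single modal value, this approach carries the between-imputation variability into the final standard error, which generally widens the confidence interval.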

RESULTS.
From a total of 10,216 entries in the SINOLAVE registry, data from 6,256 patients were analyzed after eliminating 3,960 cases that failed to meet the inclusion criteria (3,858 were records from outside of Tijuana City, 72 were repeated, and 30 had missing data, see Figure 1). Only 897 (14.33%) of the 6,256 included patients had at least one RT-PCR test for SARS-CoV-2, so it was possible to classify them as a COVID-19 case or a non-case for Analysis 1. Multiple imputation was performed on data from the remaining 5,359 (85.66%) subjects to complete Analysis 2, which included all the patients involved in this study.

Mean age for Analysis 1 was 45 years (SD 13), ranging from 0 to 88 years of age (Table 1). Analysis 2 showed a mean age of 39 years (SD 19), with an age range of 0 to 97 years. The most represented age group was 40 to 59 years (47.05%) in Analysis 1, and for Analysis 2 it was 16 to 39 years (52.40%). There were slightly more males than females included in both analyses, with 493 (54.96%) vs. 404

Of all HCWs included (

From a total of 173 physicians (Table 3), it was possible to identify the area of specialty or job position of only 126 subjects (72.8%) through a hospital records search. In both sets of analyses, the specialty with the largest representation was internal medicine. However, when resident physicians, who account for 30.51% and 24.52% in Analyses 1 and 2, respectively, were subtracted from their respective specialties, interns were the largest subset within the physician subgroup.

Nurses were the HCW subgroup with the highest odds of acquiring COVID-19 (Figure 2), with a POR = 2.339 (CI 95% 1.804, 3.032) compared to the GP in Analysis 2, and POR = 1.210 (CI 95% 0.640, 1.628) in Analysis 1. In addition, other HCWs had a POR = 1.765 (CI 95% 1.336, 2.330) in Analysis 2, whereas in Analysis 1 this was not statistically significant (OR = 0.689; CI 95% 0.511, 0.930). On the other hand, physicians showed a protective factor in Analysis 1 (POR = 0.557; CI 95% 0.365, 0.851) and a small excess in effect size compared to the GP in Analysis 2 (POR = 1.028; CI 95% 0.766, 1.380). No change was observed after stratifying by gender, age group and history of chronic disease. It was not possible to estimate the association and individual risk of dentists and laboratory staff for COVID-19 given the low number of subjects in these subgroups.
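
A reported odds ratio and its confidence interval can be cross-checked against each other. The sketch below assumes, without confirmation from the source, that the intervals are Wald-type and therefore symmetric on the log scale, in which case the point estimate should be close to the geometric mean of the limits. The function names are hypothetical; the figures used are the Analysis 2 estimates for nurses and physicians quoted above.

```python
import math


def implied_point_estimate(lower, upper):
    """Geometric mean of the CI limits (midpoint on the log scale)."""
    return math.sqrt(lower * upper)


def implied_log_se(lower, upper, z=1.96):
    """Standard error of the log odds ratio implied by a 95% Wald CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Nurses, Analysis 2: POR = 2.339 (CI 95% 1.804, 3.032)
nurses_estimate = implied_point_estimate(1.804, 3.032)
nurses_se = implied_log_se(1.804, 3.032)

# Physicians, Analysis 2: POR = 1.028 (CI 95% 0.766, 1.380)
physicians_estimate = implied_point_estimate(0.766, 1.380)
```

For both subgroups the implied estimate agrees with the reported POR to within rounding.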

Within the different physician hierarchies (Figure 3), it was found that interns had a POR = 0.345 (CI 95% 0.099,

Adjusting by gender, age group and history of chronic disease showed no difference.

Further analysis was conducted to estimate the risk attached to each medical specialty included in this study compared to that of the cluster of physicians (Figure 4)

In this study, HCWs had 73% higher odds of acquiring COVID-19 than the GP. A disparity in the number of COVID-19 confirmatory tests was observed, since the HCW cluster was tested at least three times as often (36.01%) as the GP (11.59%). Therefore, multiple imputation was performed to reduce the bias generated by the lack of confirmatory test results. Comparing between HCW categories, nurses were identified as the group with the highest likelihood of acquiring COVID-19, with nearly double the odds of the GP. Conversely, the physician subgroup showed a statistically significant protective factor in one of the analyses. However, in Analysis 2, this subgroup showed only 2.8% higher odds than the GP, without statistical significance.

Analyzing the physicians cluster by hierarchy, the group with the largest effect size estimate was resident physicians, with approximately 50% to 60% higher odds than the GP in both analyses, but neither estimate was statistically significant. On the contrary, interns showed a potential protective factor compared to the GP. Finally, emergency medicine held the largest effect size among the medical specialties included in this study, with a four- to eight-fold increase in odds compared to all the other medical specialties, and although statistically significant, wide confidence intervals were estimated. Anesthesiology followed as the second medical specialty with the highest likelihood of infection, by nearly double the estimate, but also with wide confidence intervals. In contrast, internal medicine posed a possible protective factor, with a close to 30% lower likelihood of contracting COVID-19 than the rest of physicians; however, this finding was not statistically significant in either [...]

[...] interventions carried out by nurses and having more frequent and closer contact with patients for extended periods of time compared to, for example, physicians.25,26 Therefore, they are subjected to a greater exposure than the rest of the healthcare workforce. Additionally, it should be considered that nurses are the largest group of all the HCWs in this study population. Because of this, they may also have higher probabilities of coming into contact with infected colleagues in the workplace. On the other hand, physicians were subjected to a smaller effect size, and even appeared to have a degree of protection in Analysis 1. This could be explained considering the diversity within medical specialties, including the heterogeneity of procedures they perform and the PPE

The limitations of this study are inherent to the design itself, considering that the data used were not specifically generated with the intention of answering our research question. Errors in categorization could have been made due to not having complete information on the occupation of all participants. Likewise, lack of information about HCW type of contact with patients, working hours, and frequency of exposure did not allow for further analysis to meaningfully compare different patterns between HCW categories. These results are based on data from a public healthcare system in one city in northern Mexico and thus are not necessarily internationally generalizable. It should be noted that POR is not an estimation of risk and therefore these results are to be cautiously interpreted, as they could overestimate the effect size if an approximation to risk is to be inferred.
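
The caution about reading a POR as a risk estimate can be made concrete with the algebraic relation between an odds ratio and the corresponding prevalence ratio, PR = OR / (1 - p0 + p0 × OR), where p0 is the prevalence in the reference group. The sketch below is illustrative only: the function name `odds_ratio_to_prevalence_ratio` is hypothetical and the reference prevalence of 20% is an assumed value chosen for the example, not a figure from this study.

```python
def odds_ratio_to_prevalence_ratio(odds_ratio, reference_prevalence):
    """Convert an odds ratio to a prevalence (risk) ratio.

    Follows from OR = [p1 / (1 - p1)] / [p0 / (1 - p0)] and PR = p1 / p0,
    where p0 is the prevalence in the reference group.
    """
    if not 0 <= reference_prevalence < 1:
        raise ValueError("reference_prevalence must be in [0, 1)")
    return odds_ratio / (1 - reference_prevalence + reference_prevalence * odds_ratio)


# POR for HCWs vs. GP from this study, with an ASSUMED reference prevalence.
por_hcw = 1.730
assumed_p0 = 0.20  # hypothetical value, not reported in the source
prevalence_ratio = odds_ratio_to_prevalence_ratio(por_hcw, assumed_p0)
```

Under this assumed prevalence, a POR of 1.730 corresponds to a prevalence ratio of about 1.51, which illustrates how the odds ratio moves away from the ratio of proportions as the outcome becomes more common.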

Multiple imputation helped avoid further reduction of our study population and mitigated the bias from missing data. Nevertheless, using this method for analysis produced some opposing results that could be explained by a number of factors. Primarily, multiple imputation under the missing at random (MAR) assumption implies a random distribution of attributes under the premise that missing data depend on the observed data and not on the values of the missing data, whereas RT-PCR results in Analysis 1 were obtained by testing individuals according to clinical judgement and hospital policies and resources. As a result, characteristics such as the auxiliary variables used for imputation contribute to predicting missing data, but with limitations, since complete medical records and individual hospital policies and procedures for testing were not included in the database. Therefore, the distribution of cases could differ from reality in both analyses. Likewise, results regarding medical specialties should be interpreted cautiously, as the number of participants included was low, resulting in wide confidence intervals. Finally, our study also takes into consideration the non-occupational risk to which HCWs are exposed outside the workplace, since the analyses used the GP as referent.

On the other hand, interns, who were removed from COVID-19 high-risk areas, showcased a protective factor.

Moreover, among medical specialties included in this study, emergency medicine and anesthesiology have the highest odds for contracting COVID-19, likely owing to the frequent execution of aerosol-generating procedures.

In addition, medical specialties assumed to be more exposed to confirmed COVID-19 cases, such as internal

Effect size for acquiring COVID-19 by medical specialty (GP as referent)