{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "be6478b2",
   "metadata": {},
   "source": [
    "\n",
    "# Data Loading and Preparation\n",
    "\n",
    "In this section, we load and prepare the data from two sources: `eligibilities.txt` and `usecase_1_.csv`.\n",
    "\n",
    "## Steps:\n",
    "\n",
    "1. **Import the pandas library.**\n",
    "\n",
    "2. **Load the `eligibilities.txt` data**:\n",
    "    - Use the pandas `read_csv` function to load the data.\n",
    "    - Specify the separator as `|`, since the file is pipe-delimited.\n",
    "\n",
    "3. **Select the necessary columns**:\n",
    "    - Keep only the `nct_id` and `criteria` columns.\n",
    "\n",
    "4. **Load the `usecase_1_.csv` data**:\n",
    "    - Use the pandas `read_csv` function to load the data.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "20cfb7ee-0fd8-4b37-bae1-5ab98125ad10",
   "metadata": {},
   "outputs": [
   ],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Load the eligibilities.txt data\n",
    "eligibilities = pd.read_csv('../eligibilities.txt', sep='|')\n",
    "\n",
    "# Select the necessary columns\n",
    "eligibilities = eligibilities[['nct_id', 'criteria']]\n",
    "\n",
    "# Load the usecase_1_.csv data\n",
    "usecase = pd.read_csv('../usecase_1_.csv')\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e0c015c5",
   "metadata": {
    "vscode": {
     "languageId": "markdown"
    }
   },
   "source": [
    "# Data Merging and Saving\n",
    "\n",
    "In this section, we merge the two datasets and save the result to a new CSV file.\n",
    "\n",
    "## Steps:\n",
    "\n",
    "1. **Rename the column in `usecase`**:\n",
    "    - Rename the column **'NCT Number'** to **'nct_id'** so that it matches the key column in `eligibilities`.\n",
    "\n",
    "2. **Merge the datasets**:\n",
    "    - Merge the `usecase` and `eligibilities` datasets on the **'nct_id'** column.\n",
    "    - Use a left join so that every record from `usecase` is retained; rows with no match in `eligibilities` get a missing (`NaN`) `criteria` value.\n",
    "\n",
    "3. **Save the merged data**:\n",
    "    - Save the merged data to a new CSV file named **'usecase_1_merged.csv'**.\n",
    "    - Do not include the index in the saved file.\n",
    "\n",
    "4. **Confirmation**:\n",
    "    - Print a message confirming that the **'criteria'** column has been added and the file has been saved."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b5fe97ff-8301-4b2b-9a2c-8564f9912054",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Rename 'NCT Number' in usecase to 'nct_id' so it matches the key in eligibilities\n",
    "usecase = usecase.rename(columns={'NCT Number': 'nct_id'})\n",
    "\n",
    "# Left-join on 'nct_id' so every row of usecase is kept\n",
    "merged_data = usecase.merge(eligibilities, on='nct_id', how='left')\n",
    "\n",
    "# Save the merged data to a new CSV, without the index column\n",
    "merged_data.to_csv('usecase_1_merged.csv', index=False)\n",
    "\n",
    "print(\"The column 'criteria' has been added to usecase_1_.csv and saved as usecase_1_merged.csv.\")\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}