Step #4: Thank God for the egen Command

Introduction

What's so special, really, about the egen (extensions to generate) command? The answer is that it lets you do a lot to the data. Tasks that might take many commands in other statistical programs often take only a couple of egen commands. So this is the next phase of data manipulation.

The syntax is pretty simple:

egen <new variable>= <function>(<expression(s)> or <variable(s)>) [, by (<variables>)]

The function determines what the egen command will do. There are many functions, all described in help egen, and the following sections of this step describe the most commonly used ones. The examples will hopefully clarify how to use the different functions and how they can help us.

mean()

egen store_mean_price = mean(price), by(store_id)

This example will create a variable in which, for each observation, the value will be the mean price of all observations that have the same store_id. See the figure under rowmean() for a graphic illustration.

One can omit the by option - in that case every observation in the dataset gets the overall mean of the original variable.

Other examples:
egen mean_firm_occupation_wage = mean(wage), by(firm_id occupation_id)

This will put, for each observation, the mean wage of all observations with the same firm_id and occupation_id (the observation itself included).
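
To see what the by option does, here is a minimal sketch on made-up data (the prices and the variable overall_mean_price are mine, only for illustration). Each observation gets the mean of its own store, and its own price is part of that mean; without by, every observation gets the mean of the whole dataset.

```stata
* Toy data: three stores with hypothetical prices
clear
input store_id price
1 10
1 20
2 5
2 7
2 9
3 4
end

* Mean within each store, repeated on every observation of that store
egen store_mean_price = mean(price), by(store_id)

* No by option: the same overall mean on every observation
egen overall_mean_price = mean(price)

list, sepby(store_id)
```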


rowmean()

The function rowmean also computes means, but instead of computing the mean of one variable across observations, it computes the mean across variables for each observation.

egen mean_score = rowmean(math_score physics_score chemistry_score)

Suppose you had a dataset of students and their scores. This example will simply create a new variable - mean_score - which will hold the mean of math, physics and chemistry score for each of the students.

Note that because rowmean computes the mean separately for each observation, the by option is irrelevant. Take the previous example: there is no point in adding a by(class_id) option to the egen command when using the rowmean function. If you want the mean score in a class for any one of the subjects (mean score across students), you should use the mean() function instead of rowmean(). If you want the mean (across students) of the mean score (across subjects), you need to first do rowmean and then mean (or vice versa).
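
Here is a rough sketch of the two-step version on made-up data (the scores, student_id and the name class_mean_score are my own assumptions, not part of any real dataset). First rowmean gives one number per student, then mean with by(class_id) averages those numbers within each class.

```stata
* Toy data: four students in two classes, hypothetical scores
clear
input student_id class_id math_score physics_score chemistry_score
1 1 80 70 90
2 1 60 80 70
3 2 90 90 90
4 2 70 50 90
end

* Step 1: mean across subjects, one value per student
egen mean_score = rowmean(math_score physics_score chemistry_score)

* Step 2: mean across students within each class
egen class_mean_score = mean(mean_score), by(class_id)

list, sepby(class_id)
```

When no score is missing, doing the steps in the other order gives the same class-level number. With missing scores the two orders can differ, so it is worth deciding which one you actually mean.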

You might ask, what's the difference between the rowmean() and simply using the gen command:

gen mean_score = (math_score + physics_score + chemistry_score) / 3

There are two main differences:
1. You can use wildcards - The same rowmean command can be written like this:

egen mean_score = rowmean(*_score)

This is very useful if the list of variables is very long, or if you think that later on you might add english_score, history_score, and so on, to your dataset, and you don't want to update this command every time. Just make sure the pattern matches only the variables you mean to average. To learn more about wildcards, see help varlist.

2. Missing values - If one of the variables mentioned above is missing, the gen command cannot sum the three variables and will therefore put a missing value in mean_score for that observation. This is the case even if the other two scores are not missing. egen rowmean, on the other hand, disregards the missing values and computes the mean of only the nonmissing values in the variable list. Only if all of the specified variables are missing will egen rowmean put a missing value in the generated variable (just like the gen command). It's up to you to decide which one is better. Sometimes a single missing value is enough to make the mean irrelevant to what you are measuring, and sometimes you may decide that the mean of the nonmissing variables is good enough.
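
The missing-value difference is easy to see on a tiny made-up dataset (the scores and the names mean_gen and mean_egen are hypothetical). The second student is missing one score, the third is missing all of them.

```stata
* Toy data: one complete row, one partly missing, one fully missing
clear
input math_score physics_score chemistry_score
80 70 90
60 .  70
.  .  .
end

* gen: any missing score makes the whole expression missing
gen mean_gen = (math_score + physics_score + chemistry_score) / 3

* egen rowmean: averages whatever is nonmissing in each row
egen mean_egen = rowmean(*_score)

list
```

For the student with scores 60 and 70, mean_gen is missing while mean_egen is 65. For the student with no scores at all, both are missing.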


[Figure: how the mean() and rowmean() functions work]

Note: Although the observations are sorted according to the by variable (class_id) here, you do not need to sort them first. egen doesn't need the dataset to be sorted according to the by variable (although I'm guessing that if it's sorted, it will take less time to process).

sum() and rowtotal(), max() and rowmax(), min() and rowmin()

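sum() and rowtotal() work like mean() and rowmean(), but instead of calculating means, they calculate sums. (In recent versions of Stata, help egen lists sum() under the name total(); sum() is the older name.) Here are some examples: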

egen team_effort = sum(effort), by(team)
egen total_correct_answers = rowtotal(question_*)

There is a small difference between rowmean and rowtotal in the way missing values are treated. rowtotal simply treats missing values as zeroes. So even if the values are missing in all the specified variables, the new rowtotal variable will be 0. rowmean would have put a missing value instead. If you want a missing value there, you might want to do something like this:

egen total_correct_answers = rowtotal(question_*)
egen nonmissing_answers = rownonmiss(question_*) // This function puts the number of variables for which this observation has a nonmissing value
replace total_correct_answers = . if nonmissing_answers == 0
drop nonmissing_answers
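
As far as I know, recent versions of Stata also document a missing option for rowtotal() that does the same thing in one line; check help egen to confirm that your version has it. A minimal sketch on made-up answers (the data and the names total_default and total_or_missing are only for illustration):

```stata
* Toy data: answers to three questions, last row entirely missing
clear
input question_1 question_2 question_3
1 0 1
1 . 0
. . .
end

* Default behaviour: missing values count as zeroes
egen total_default = rowtotal(question_*)

* With the missing option: missing only when every question is missing
egen total_or_missing = rowtotal(question_*), missing

list
```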

For max and min, the function gives the maximum or minimum value, respectively, out of all the values of the specified variable (within a group, if you are specifying the by option). Let's say, for example, that you want to normalize all values of the variable wage in a group so that each observation will have 1 if its wage is the maximum wage in a group, 0 if its wage is the minimum and some number between 0 and 1 if it's in between. The following set of commands will do the job pretty easily:

egen max_wage_f = max(wage), by(firm)
egen min_wage_f = min(wage), by(firm)

gen norm_wage_f = (wage - min_wage_f) / (max_wage_f - min_wage_f)
replace norm_wage_f = 0.5 if max_wage_f == min_wage_f & !missing(wage) // This is for the firms in which the maximum wage equals the minimum wage. The command above gives them missing values because the denominator equals zero. Observations whose wage is missing stay missing.


Here is another example, this time using egen max to copy a nonmissing value to observations with missing values. Suppose you have a dataset of students in schools. In one of your regressions you want to take into account the number of children in each school whose father dropped out of high school. In order to do so, you will use the following commands:

egen f_dropout_kids_only = count(student_id) if f_educ<12, by(school) // This will count the number of children for whom the condition applies. Observations for which the condition does not apply get a missing value in the generated variable.

egen f_dropout_kids = max(f_dropout_kids_only), by(school)

drop f_dropout_kids_only // Don't get confused. This will drop the variable f_dropout_kids_only that is no longer needed

Although a missing value is, in general, treated as greater than any nonmissing value, egen max ignores missing values. The nonmissing value in each school - i.e., the count stored in the observations of the kids whose father dropped out - is therefore copied to the other observations in the same school (we need all of them in the regression: both the ones to whom the condition applies and the ones to whom it doesn't). The following figure might make it clearer.






Note: Instead of egen max, we could have also used egen min or egen mean as the second command. Both egen min and egen mean ignore missing values, and since the nonmissing values are all equal within a by category, these functions yield that same value. This is not true for egen sum, though, because the sum function adds up the nonmissing values, which amounts to multiplying the value by the number of nonmissing observations.
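
For comparison, here is a sketch of a one-step variant next to the two-step approach, on made-up data (the schools, the f_educ values and the name f_dropout_kids_alt are hypothetical). It relies on the fact that a logical expression such as f_educ < 12 evaluates to 1 when true and 0 when false, and on total(), which is the name under which current versions of Stata document sum().

```stata
* Toy data: three schools; school 3 has no father with f_educ below 12
clear
input school student_id f_educ
1 1 10
1 2 12
1 3 8
2 4 16
2 5 11
3 6 14
end

* Two-step approach: count within the condition, then spread with max
egen f_dropout_kids_only = count(student_id) if f_educ < 12, by(school)
egen f_dropout_kids = max(f_dropout_kids_only), by(school)
drop f_dropout_kids_only

* One-step variant: add up the 0/1 indicator within each school
egen f_dropout_kids_alt = total(f_educ < 12), by(school)

list, sepby(school)
```

The two versions differ in one case: a school in which no child meets the condition gets a missing value from the two-step approach and a 0 from the one-step variant. Decide which of the two you want before running the regression.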


For additional within-group statistics, see help egen and look for functions such as sd() (for standard deviation), median(), mode() and others. You can also calculate statistics across a group of variables for each observation (instead of across a group of observations for one variable). See help egen and look for the functions that start with row: rowmean(), rowmin(), rowmax(), rowsd(), rowtotal(), etc.
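
These functions are used exactly like mean(). A short sketch with sd() and median() on made-up wages (the data and the names sd_wage_f and median_wage_f are only for illustration):

```stata
* Toy data: two firms with hypothetical wages
clear
input firm wage
1 10
1 20
1 30
2 40
2 60
end

* Standard deviation and median of wage within each firm
egen sd_wage_f = sd(wage), by(firm)
egen median_wage_f = median(wage), by(firm)

list, sepby(firm)
```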

tag() and group()


These two functions are really useful when working with identifying variables that have more than one observation per value (group-identifying variables) - the same ones we used before in the by options.

Suppose you have a dataset of gas prices. Each observation has the type of fuel, the price per gallon, the station ID, and the week in which the price was recorded. Not all stations were recorded in each of the weeks. That is, in some weeks, some stations didn't have their price taken. In your research, you decide to work only on stations for which you have full data - i.e., those which appear in each of the 50 weeks.

If the data had only one observation per station-and-week combination, you could have just used the count() function of egen:

egen station_count = count(week), by(station) // This will count the number of observations with non-missing values in week, for each value of station, and put the result for each observation of that station.

The problem is that each station-week combination has more than one observation, and the number of observations per station-week varies between stations (remember, each station-week has as many observations as the number of fuel types - a price for leaded, unleaded, premium, etc.). Simply counting observations will not work here. We need to "tag" station-weeks: in other words, we will create an indicator (a dummy variable) which will be 1 for only one observation per station-week combination, and 0 for all other observations of the same station-week.

Once we tag each new station-week combination, we can count how many station-week combinations there are for each station - this gives us the number of weeks for each station. Although I said we need to count, we will use the sum() function of egen: count() would also count the observations whose tag is 0 (it counts every nonmissing value), whereas sum() adds up only the 1s, which is what we want because no station-week should be counted more than once:

egen station_week_tag = tag(station week) // We're not using the by option since the group-identifying variables are already in the tag.

egen weeks_of_station = sum(station_week_tag), by(station)
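
Here is how the two commands behave on a tiny made-up dataset (the stations, fuel types and prices are hypothetical). Station 1 appears in two weeks, station 2 in only one week but with three fuel types.

```stata
* Toy data: several fuel types per station-week
clear
input station week str8 fuel price
1 1 "regular" 3.10
1 1 "premium" 3.50
1 2 "regular" 3.15
2 1 "regular" 3.05
2 1 "premium" 3.45
2 1 "diesel"  3.30
end

* 1 on exactly one observation per station-week, 0 on the rest
egen station_week_tag = tag(station week)

* Adding up the tags within a station gives its number of distinct weeks
egen weeks_of_station = sum(station_week_tag), by(station)

list, sepby(station)
```

In the real dataset you could then keep only the fully observed stations with something like keep if weeks_of_station == 50.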

Graphically, this is what we actually do:


The group() function is used in the same manner as the tag() function, but instead of putting 1 on one observation of each combination and 0 on the rest, it numbers the combinations: it puts 1 on all the observations of the first combination, 2 on all the observations of the second combination, and so on. The combinations are numbered in the sort order of the variables' values, not in the order in which they happen to appear in the dataset. The benefit of this function is that you get a single numeric variable that enumerates all combinations in order. When we deal with loops, it might become clearer why this is good.
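
A minimal sketch of group() on made-up data (the IDs and the name firm_occ are hypothetical). There are three distinct combinations of firm_id and occupation_id, so the new variable takes the values 1, 2 and 3.

```stata
* Toy data: combinations deliberately entered out of order
clear
input firm_id occupation_id
20 3
10 2
20 3
10 1
10 2
end

* One running number per distinct (firm_id, occupation_id) combination
egen firm_occ = group(firm_id occupation_id)

list
```

Here the combination (10, 1) gets 1, (10, 2) gets 2 and (20, 3) gets 3, even though (20, 3) is the first one in the data.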


Conclusion

The egen command can help you play with the data pretty easily and intuitively (once you get the trick of the function you are using). There are other egen functions I did not describe here. As I said, you can look them up in help egen, and don't be afraid to experiment with them.

To check whether your function works, browse your dataset. Sort it first by the group variables you used, and then browse the variables you want to check. If you have a large dataset, you can limit the browse command using if or in. Here are two examples:

sort firm

browse firm employee wage min_wage_f max_wage_f norm_wage_f in 2000/2200 // This will browse observations #2000 through #2200

browse firm employee wage *_wage_f if firm >= 100 & firm <= 200

You can do the same with the list command, by the way (but list is limited to the width of the output screen).


Good luck!

(Go on to Step #5)
