A Basic Guide to Using the ExBuilder Analysis Tool
I’m Natalie. Katie showed me how to use the ExBuilder Analysis Tool one time, and I tried to take good notes. My goal in writing the notes was to be able to use the Tool without being shown from scratch again. That’s my disclaimer that these notes might not be detailed enough if you have never used the Tool before. I’ve tried to expand on a few things so you can go ahead and try it out, but I just wanted to get the disclaimer out of the way. My other disclaimer is that I don’t do anything incredibly fancy in here. The take-home message is that, while these notes might help you, it is entirely possible, probable even, that you will need to get help using the Tool at some point, whether it’s for fancy stuff or really basic stuff. My final disclaimer is that there is no doubt more than one way to do certain things, and this is just a guide for one of those ways. Be your own person and all that.
I am assuming that you set up your experiment to include the relevant messages for your analyses. For example, you might have a special message sent to the data file every time a new picture goes on the screen, or every time a subject clicks on something. This guide assumes you are happy with the messages in your data files.
You must first take the data files and convert them to .asc files. There’s an ExBuilder icon that will do this conversion drag-and-drop style. Here I must note that if your tags are somewhat long, this conversion process might cut off those tags and cause problems for your analysis. Thankfully, the PC on the big desk near the printer has a special conversion program that fixes this problem and preserves your tags in their full form. My personal preference is to go straight to that computer with my data files and do the conversion there. It should be noted that this computer ("yumyum") does require that you have an account to log in. Also, this special conversion program is currently only on Katie's account on that computer, alongside the official edftoasc converter. If you want, you can either copy the special converter from Katie's desktop or make a shortcut to it from your account. UPDATE: You will never need to use the special converter unless your messages are really long. If you need it and you can't figure out which one "yumyum" is, email me and I'll send it to you. -katie
Then you will need to move just the .asc files into a new folder that does not have a space in its name. This folder, along with the original experimental script file, will be what you need to use the Analysis Tool. Pop those items on a thumb drive, or stay put and use this computer. But choose wisely, because once you load a database, you can’t really move it to another workstation.
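Not that you need a script for this, but the no-spaces folder rule can be sketched in Python if you have a lot of files to shuffle. This is just my own sketch, not part of the Tool; the file names are made up:

```python
# Sketch: batch-move .asc files into a space-free folder before loading
# them into the Analysis Tool. Folder/file names here are hypothetical.
import os
import shutil

def is_safe_folder_name(name):
    """The Analysis Tool chokes on folder names containing spaces."""
    return " " not in name

def pick_asc_files(filenames):
    """Keep only the .asc files from a directory listing."""
    return [f for f in filenames if f.lower().endswith(".asc")]

def move_asc_files(src_dir, dest_dir):
    """Move all .asc files from src_dir into a freshly made dest_dir."""
    if not is_safe_folder_name(os.path.basename(dest_dir)):
        raise ValueError("destination folder name must not contain spaces")
    os.makedirs(dest_dir, exist_ok=True)
    for f in pick_asc_files(os.listdir(src_dir)):
        shutil.move(os.path.join(src_dir, f), os.path.join(dest_dir, f))
```
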
Opening the Analysis Tool
Go ahead and open ExAnalysis. It might crash, look warped on the screen for a while, or just take a long time to open. You might have to try opening it again. But once it is open, you will see along the left side that there are many, many databases loaded. This represents the productivity of your colleagues, and it should make you feel a little bit bad about yourself. That is normal. Be kind, though, and do NOT unload anyone’s database. This is bad. Try your best to ignore the databases of others. Yes, a loaded database means loaded for all users on that computer. The database files, .ldf and .mdf, are big and go together, and at last check there was no easy way to transfer them to another computer.
Create New Database
Click on Analysis Databases in the sidebar to get the correct option boxes on the lower left. When it appears, click on Create New Database. You have to click on Experiment Settings, open the Settings window, and ADD the script from your experiment. PortName must be in your script for the Analysis Tool to work properly, but don’t ask me exactly what that means.
Before you proceed, click on Save Settings.
Click on Database Name, Import Data, and when given the option, hold down SHIFT to select all your .asc files.
Now the Tool will ask you several questions about your data files. I am not a guru, in the sense that I don’t understand the deeper meaning of many of these questions. That might be nice info for the wiki someday, but for now I’ll try to give practical answers; you might want to consult an expert or play around with these parameters at some point.
The Tool will ask you whether you want to set a minimum and a maximum trial length. SAY YES. This somewhat equalizes the length of trials in your data. For example, if someone sneezes and fails to click on a picture for thirty seconds, setting a maximum will prevent that one trial from being analyzed all the way out to the thirty second mark. What values you put as min and max will depend on the type of task your experiment involves. I can’t help you there. I will say that I have full-sentence stimuli, and having 3000 ms be the maximum trial length was TOO SHORT.
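To make the min/max idea concrete, here is a rough Python sketch of what the cutoffs amount to conceptually. The function names are mine, not the Tool's, and I'm only guessing at how the Tool applies them internally:

```python
# Conceptual sketch of min/max trial length. Times are in ms from trial
# onset. Function names are hypothetical, not the Tool's terminology.
def clip_trial(sample_times, max_ms):
    """Drop samples beyond max_ms (so the sneeze trial stops counting)."""
    return [t for t in sample_times if t <= max_ms]

def long_enough(sample_times, min_ms):
    """A trial shorter than min_ms may be excluded entirely."""
    return bool(sample_times) and max(sample_times) >= min_ms
```

So with a 3000 ms maximum, a sample at 3500 ms simply never enters the analysis, which is why a too-short maximum quietly throws away the end of your sentences.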
A strange window will pop up about an .exlog file. Do not panic. JUST SAY NO. I have no idea what this window is about, but say no and proceed.
Confirm Subject Details.
The sample rate is probably 250 Hz. No need to change that.
At this point, questions about the messages in your script/data will appear. Again, I can’t totally explain everything to you.
You will be asked to choose a category for Port Image 1. My advice is to say “pic” or “image.” DO NOT click “always use this category.”
For tag1, put “tag.” Basically, at least for my studies, all Images are “pic” or “image,” and all Tags are “tag.” But I can’t promise that this is appropriate for your study. In the 4pic world, though, this has worked for me.
Now click OK, and it will actually begin to import the data. This might take 10-20 minutes, depending. Do not be alarmed. Just take a break from all this. You deserve it if you’ve made it this far.
Now is the time you might see error messages appear. Those can range from trivial to breathtakingly serious. I can’t help you much with errors, but most of us have dealt with some variety of error in the past, so you’re in good company. In order to proceed, though, I am going to assume that nothing bad happened during importation.
Ultimately what you probably want from the Tool is eye-tracking data aligned to certain stimulus onsets. Am I right? We’re almost there. In the left bar, where you see your database listed, click on the Filtered Data bracket, and click New Filter. Here is another point at which I can’t completely help you. I don’t know what you did and I don’t know what you’re interested in. But here is where you can play around and filter data by conditions. You have to remember what you called things in your script, and it might be helpful to open the script so you can look at it. But, as an example, you might want to type r_cond==”generic” in the box, and that would give you data from the “generic” condition in your study. Always click SAVE. You have to write new filters from scratch to avoid overwriting old filters, by the way, so be careful.
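If you ever want to double-check what a filter like that is doing, the underlying idea is just row filtering by a column value. A tiny Python sketch; the r_cond column name comes from the example above, and the rest of the row fields are made up:

```python
# Sketch of what a filter like r_cond=="generic" does, using plain
# Python rows. Column names besides r_cond are hypothetical.
def filter_rows(rows, column, value):
    """Keep only rows whose `column` equals `value`."""
    return [r for r in rows if r.get(column) == value]

rows = [
    {"subj": 1, "r_cond": "generic", "time": 0},
    {"subj": 1, "r_cond": "specific", "time": 0},
]
generic_only = filter_rows(rows, "r_cond", "generic")
```
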
When you have the filters set up the way you want, you can go to Look Count Data, which is the prize. You want to always make your selections from left to right in the window, because of a quirk with the Tool.
- Column of Interest is the y-axis of the graph you might want; in eye-tracking, this will often be proportion of looks. So a good selection to make would be LP_tag, which is the left pupil. Do not be fooled by LP_image. This will not give you proportion of looks.
IMPORTANT NOTE: If you track both eyes, ExBuilder WILL NOT copy the data from the better eye over the data from the worse eye, and some of your subjects might have a better track from the left eye and some a better track from the right eye. You can't select the best eye for all participants if you track both eyes, unless you do something special to fix this problem on your own; it can't be done automatically in ExBuilder. This is why I only track the better eye to begin with, unless both are "GOOD" according to the EyeLink. If you lock out any eye that is not "GOOD" when you are running the study, ExBuilder will copy the data from the good eye to the one that wasn't tracked. So you can track the right eye of subject 1 and the left eye of subject 2, etc., and then analyze LP_tag or RP_tag; as long as you don't track an eye that isn't "GOOD," you will always have data that is pretty good. If you have personal reasons for not wanting to track just one eye, or you already did the study and you tracked both eyes, then you will have to write a script to edit the .asc files if you want to make sure you get the best data from each file. -Katie
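If you do end up writing the script Katie mentions, the eye-picking logic might start out something like this Python sketch. The quality labels and the LP_tag/RP_tag column names come from this guide, but the scoring rule is my own assumption, not anything official:

```python
# Hedged sketch: decide per subject which eye's column to analyze,
# given the tracker's quality label for each eye. The ranking below
# is an assumption, not ExBuilder's or the EyeLink's own logic.
QUALITY_RANK = {"GOOD": 2, "FAIR": 1, "BAD": 0}

def best_eye_column(left_quality, right_quality):
    """Return the column name for the better-tracked eye (ties go left)."""
    if QUALITY_RANK.get(left_quality, 0) >= QUALITY_RANK.get(right_quality, 0):
        return "LP_tag"
    return "RP_tag"
```

You would still have to apply the choice to the .asc files themselves, which is the genuinely fiddly part.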
Group By is the x-axis. You might be tempted to select “time” here. Resist. What you probably want instead is time_rel_blank where “blank” is an onset of a particular stimulus. For example, I have often used time_rel_N, where my experiment was set up to send a message every time a Noun was in the sound file stimulus. So this way, you align all trials to the onset of the Noun. This is the real payoff of ExAnalysis, by the way. The whole point is to get this aligned data.
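The alignment itself is simple to picture: subtract the Noun-onset message time from every sample time, so samples before the Noun come out negative. A Python sketch of that idea (the function name is mine):

```python
# Sketch of what time_rel_N amounts to: re-zero each trial's sample
# times at the Noun-onset message. Samples before the onset go negative.
def align_to_onset(sample_times, onset_time):
    """Return sample times relative to the given onset, in ms."""
    return [t - onset_time for t in sample_times]
```

This is also why, later on, you can tell an exported file is aligned when its time column starts with negative values.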
That said, as soon as a spreadsheet-like list of numbers appears for your filters and for your alignments, you should highlight that whole thing and copy it and paste it into Excel. That’s my advice, anyway. Do this immediately, because while everything else about your database will be saved, the Look Count Data will NOT be. So snatch all the numbers and save them to an excel file. If you are desperate for a graph, and can’t wait to make your own in Excel, then you can look at the one given to you here. You can even zoom in on it, right click it to save the image, and print it out. I recommend this if you need a quick fix, but don’t plan on using those graphs publicly.
That’s about it! I can’t help you with much more than this. Again, you might want to watch someone do it once and then use these notes as a reminder. Or you might need someone else to explain ad nauseam the details and minutiae. Chances are you probably know who the most detail-oriented people in the lab are, and you can ask them if you have specific questions. All mistakes here are mine, but Katie should get the credit for showing me how to use the Tool. If you have anything to add to this tutorial, feel free!
Analysis Advice Column
Okay, this section is not edited yet, but it's part of a correspondence between Katie and myself that some people might find important. Details later, I hope, but it pertains to formatting data for subject/item ANOVA:
- So this is about my databases that HAVE worked already, for the other 4pic classifier studies: I have gotten some nice eye-movement graphs that are aligned to the Noun onset. The graphs look good, but now I want to actually analyze the data. My impression was that I should divide the time-span into windows of about 200 ms, then for each of those windows do an ANOVA (or something fancier, but that's not the point). The problem I'm finding is that I thought the way to do this would be to have, for each subject, a proportion of looks to the 4 items for that window for each item, so: Subject 1, Item 1, Time_Window 1, 50% looks to target, 20% looks to competitor, 15% looks to other1, 15% looks to other2. Then I could run an ANOVA or something.
But the ExAnalysis filter has left me with no subject or item info. I only used a filter on the condition, and it did not leave me with any subject or item # tags! Which seems weird! It looks like it's all averaged over subjects and items, so what I have is only for each condition, a time stamp and a % of looks to each of the four items. I don't feel like I can run an appropriate analysis on that. Can I? How do I either a) get the subject/item info from ExAnalysis but preserve the Noun-onset alignment, or b) do an appropriate analysis without that information? Do you know dude? THANKS, Natalie
Dear Natalie, I know exactly what you mean, dude. The data shown in the look count window is always averaged over subjects and items together, so it is totally impossible to officially analyze it: you can't get the averages for the individual subjects, and you can't add subject or item as a random effect to any regression model. It is poorly designed.
I have dealt with this before though, when I was doing an ANOVA on 4 pic data. You have to export the data by subject, which is not that hard. But, I'm not sure how I aligned the data when I did this. I think if you line it up with the noun onset selected in the look count window and THEN export the data from that window, by subject, that will work. After you export the data it should be saved in different .csv files with the subject names in the files. You should be able to tell by looking at the exported files if they are aligned or not because if they are, then the initial values in the time column will be negative, otherwise they will start at 0.
The annoying thing is that you have to do this separately for each condition. So for example, my experiment was 2x2, so there were 4 different "conditions" (and thus 4 filters), and I had to export the data separately by subject for each one. So then there were 4 files for each subject. Also, for each subject in each condition I wanted the averages for different segments (0-200 ms, 200-400, 400-600 are what I used). It sounds like this is what you want to do, too.
So after you export the files, you can calculate the averages for each subject for each condition for each time segment, for the ANOVA. But you have to put it into Excel and then do some things to it to get the averages -- ExBuilder doesn't average anything on its own. I know I had some macros that I used to do some of it, but it's hard to describe exactly what I did over email. In the end, after you get the averages, you can copy and paste them all into one file, and that will be what you use for the analysis. If you want, we can meet in person and I can try to show you what I did. That might be more helpful than this email. I don't know.
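If you would rather skip the Excel macros, the averaging Katie describes can be sketched in Python. The row fields here (subj, cond, time, prop_target) are hypothetical names I made up, not the Tool's actual export columns, so you would need to adapt them:

```python
# Rough stdlib-only sketch of the Excel step: average the
# proportion-of-looks column per subject, per condition, per 200 ms
# window. Row field names are hypothetical.
from collections import defaultdict

def window_of(time_ms, width_ms=200):
    """Bin a time stamp: 0-199 -> window 0, 200-399 -> window 1, etc."""
    return int(time_ms // width_ms)

def window_means(rows, width_ms=200):
    """rows: dicts with subj, cond, time, prop_target.
    Returns {(subj, cond, window): mean prop_target}."""
    sums = defaultdict(lambda: [0.0, 0])
    for r in rows:
        key = (r["subj"], r["cond"], window_of(r["time"], width_ms))
        sums[key][0] += r["prop_target"]
        sums[key][1] += 1
    return {k: total / n for k, (total, n) in sums.items()}
```

The resulting one-row-per-subject-per-condition-per-window table is exactly the shape you want to paste into one file for the ANOVA.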
It's kind of hard to believe what we have to go through to use the stupid analysis tools. But anyway, there IS a way to do it, it just isn't as easy as it should be unfortunately...
SQL Database Information
One further thing you might want to do once you have done some basic manipulations in ExAnal is to create a database that preserves subject and item info. The cut-and-paste method described above will give you pretty graphs that are representative of pooled data. This will NOT necessarily match the by-subjects or by-items averages you really wish to compare. You will notice, possibly with alarm, that there IS NO subject OR item information in your data file after performing the steps above. The previous section made some mention of this, but now I am offering you a real solution based on some wisdom that has trickled down my way.
There are, in fact, TWO solutions. The first one is to export each subject's data file individually from ExAnal. To do this you will:
- Go into the ExAnal Look Count menu and create the look count situation you want, as detailed above.
- Click on the EXPORT button.
- Select SubjectName or something similar (or something like ItemName if you want to do an item analysis) as the by-column.
- Leave the other by-column blank (as far as I know).
This will generate a separate .csv file for EACH subject (or item). Then you must concatenate the files by hand after inserting an additional column to indicate the subject name or item number. I cannot offer many details about this method, since I do not really do things this way.
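The concatenation step can be scripted rather than done by hand. A Python sketch, assuming the exported filenames contain the subject name (that naming pattern is my assumption; adjust it to whatever the Tool actually emits):

```python
# Sketch of the concatenation step: tag each exported file's rows with
# a subject name taken from the filename, then stack them into one
# list. The filename pattern is an assumption.
import os

def subject_from_filename(path):
    """'exports/subj03.csv' -> 'subj03' (assumed naming convention)."""
    return os.path.splitext(os.path.basename(path))[0]

def tag_and_concat(files_to_rows):
    """files_to_rows: {path: [row dicts]}.
    Returns one combined list with a 'subject' column added."""
    combined = []
    for path, rows in files_to_rows.items():
        subj = subject_from_filename(path)
        for r in rows:
            combined.append(dict(r, subject=subj))
    return combined
```
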
THE SQL SOLUTION
Another thing you can do to capture all the information you could possibly want into one file is to use SQL databases. It is a little quirky but I will relay the wisdom as I have learned it in the following easy steps:
Create your ideal data filter in Filtered Data. The hack way I do things involves making a separate filter for each condition in my study, so I'll just pretend you have not found a cleverer way to do things. So make your ideal filter for one condition. While doing this, you should also only select the check-boxes in the left-hand menu for things you want COLUMNS for in your final data file. For example, you might not really need an entire column of Right_X (which I believe holds coordinates for the right pupil). So de-select it and then it will not appear in your file. Of course, you are welcome to keep EVERYthing and filter later. Your call.
Click on the little hyperlink that says "Generated SQL." It does NOT appear to have done anything. This is merely a trick. It has, in fact, done something.
Now ignore ExAnal and open a program called Microsoft SQL Server Management Studio Express.
Click on "Connect."
Select your database from the side bar. If it isn't there, you are screwed.
Click on "New Query."
Now you just go to the Edit menu and select "Paste." You are probably thinking that there isn't anything to paste, right? WRONG. If you did this correctly, some amazing syntax will magically appear in your query. That comes from when you clicked the strange hyperlink in ExAnal and thought nothing had happened. If you have copied anything in the intervening period, the syntax will not appear, so go back to ExAnal and click that link again on the Filtered Data of your choice.
Now before you do anything else, you must go to the Query menu, then select "Query Options." Choose "text" from the menu on the left. Now you can change the output to display as a comma-delimited file. You might want something else, but I like comma-delimited. If you do multiple queries in one session, you WILL need to keep changing this in the menu each time.
Go back to the Query menu again and select "Results to...File"
Again, it appears to have done nothing, but we now realize that this impression is false when working with SQL databases. Now you want to click on the "!Execute" button or select "!Execute" from the Query menu. It will let you save the file now, as an .rpt (report file). This is good.
You can open the file in WordPad or another text editor, but I recommend you open it in TextWrangler, get rid of any carriage returns or formatting quirks that will make importing the data (into R) difficult, and save it as a UNIX file. Your choice, though. I'm not the boss of you. If you have problems importing the file, one thing that might have happened is that the column labels got out of alignment due to some of the items in the data file or due to formatting/punctuation marks. That happened to me once, and I had to just add additional column headers. If this is your problem, you should get a "header number" type error when trying to import the file into R.
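The carriage-return cleanup, at least, can be done with a one-liner instead of TextWrangler. A Python sketch of that normalization:

```python
# Sketch of the line-ending cleanup: normalize Windows (\r\n) and bare
# carriage returns (\r) to Unix newlines (\n) before importing into R.
def to_unix_newlines(text):
    """Return text with all line endings converted to \n."""
    return text.replace("\r\n", "\n").replace("\r", "\n")
```

Read the .rpt file as text, run it through this, and write it back out; R's read.csv should then be much happier.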
Hope this helps! Now you should have one big data file that has subj and item info preserved, is comma-separated, and has the columns you selected in the filtered data portion of ExAnal. Now you can go forward with your analysis.
NOTE: As of October 2015, probably due to an SQL Server update, the following characters and words should not be used as (parts of) filter names: