In the 1960s, the Department of Agricultural Economics at North Carolina State University needed powerful tools to support its research projects. In response, NCSU built a consortium with seven land-grant universities and embarked on a project to develop a general-purpose Statistical Analysis System (SAS). SAS quickly gained popularity beyond academia, becoming the leading choice for many enterprises for statistical analysis and data management.
However, in recent years, SAS has faced increasing challenges in the rapidly evolving data analytics market. While it remains the preferred choice in highly regulated sectors like finance and pharmaceuticals, its high subscription fees and steep learning curve have deterred many startups and individual learners. Because much of today's progress in data science is driven by open-source communities and their collaborative approach, SAS's nature as an expensive proprietary software suite has limited its access to ongoing research and talent pools in the field. Consequently, SAS has been slower to adopt cutting-edge AI models, which can make it look dated.
To address these challenges, SAS launched the SAS OnDemand for Academics program. This program provides free access to SAS Studio, a web-based interface that allows users to write and execute SAS programs without installation. This initiative aims to make SAS more accessible to individual learners and foster a new generation of data scientists equipped with SAS programming skills.
First Look at SAS Studio
Let's navigate to SAS OnDemand for Academics. Signing in to your account and clicking "Launch" will start a new session. The user interface consists of three major parts: the top menu, the work area, and the navigation pane.
- Top Menu: Overall application controls and functions of SAS Studio, including:
  - Searching for and opening files that are uploaded to your SAS Studio environment.
  - Switching between the SAS Programmer and Visual Programmer perspectives.
  - Customizing your SAS Studio work environment.
- Navigation Pane: Sections to manage and organize your work files:
  - Server Files and Folders: Lets you browse and access the files stored in your SAS Studio environment.
  - Tasks and Utilities: A collection of predefined tasks and workflows that you can readily apply to common data processing needs.
  - Snippets: A collection of code snippets for common data processing tasks. You can also create your own for later use.
  - Libraries: Stores and organizes SAS data sets.
  - File Shortcuts: Creates and manages file shortcuts.
- Work Area: The main space where you create your SAS programs, each in either the SAS Programmer or the Visual Programmer perspective.
SAS Studio has two user interface perspectives tailored to different user needs: SAS Programmer and Visual Programmer. SAS Programmer is the default perspective when you open SAS Studio. It allows users to write, edit, run, and debug SAS code directly. Program files created through this perspective have the .sas extension.
On the other hand, the Visual Programmer perspective allows you to build workflows by dragging and dropping files and functions (items in the "Tasks and Utilities" section on the left panel). You can also add your own SAS program to the Process Flow: click the + sign on the menu bar under the work area and select SAS Program. Double-clicking the new node opens a text editor where you can write and run your SAS program.
Under the Visual Programmer perspective, you can visually explore and review the whole process of your data analysis project. The nodes are connected in the order of data processing, so you can confirm the workflow at a glance. The working files are saved as Process Flow files with the .cpf extension.
How to Upload External Data Files?
SAS Studio is a cloud application. Before any data processing, you must first upload the data files stored on your local machine. To upload a local file, go to the "Server Files and Folders" section on the navigation pane and click "Files." Next, select the destination folder under "Files" and click the "Upload" button. This opens the file selection dialog, where you can select local data files.
After uploading the file, create a new SAS data set through either a DATA step or PROC IMPORT. SAS procedures work on SAS data sets rather than on raw data files like CSV or Excel, so such files need to be imported into a SAS data set, organized in columns and rows. The easiest way is to use PROC IMPORT through the point-and-click interface of SAS Studio.
Right-click on the uploaded data file and select "Import Data."
By default, the imported SAS data set is saved in the temporary WORK library and named "Import." To keep it elsewhere, click "Change" and replace the library and data set name. Next, in the "Start reading data at row" field, enter the row number at which data reading should start.
In a SAS data set, every column must have an appropriate length and data type. The "Guessing rows" field sets the number of rows SAS scans to determine these attributes. Choose a number large enough to cover the longest and most varied values in each column; it does not need to cover every row. When everything is set, click the "Save" and "Run" buttons to start the import.
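The point-and-click settings above correspond roughly to a PROC IMPORT step like the following. This is a minimal sketch, not the exact code SAS Studio generates: the file name sales.csv and the output name work.sales are hypothetical, and the path assumes the file was uploaded to your home folder.

```sas
/* Sketch: import an uploaded CSV file into a SAS data set.       */
/* Assumptions: sales.csv is a hypothetical file in your home     */
/* folder, and its first row holds the column names.              */
proc import datafile="/home/your-user-name/sales.csv"
    out=work.sales      /* library.dataset to create */
    dbms=csv            /* type of the source file */
    replace;            /* overwrite work.sales if it already exists */
    getnames=yes;       /* take column names from the first row */
    datarow=2;          /* "Start reading data at row" */
    guessingrows=500;   /* "Guessing rows": rows scanned to infer type and length */
run;

/* Check the inferred column types and lengths */
proc contents data=work.sales;
run;
```

Reviewing the PROC CONTENTS output after the import is a quick way to spot columns whose type or length was guessed poorly.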
SAS Libraries
A SAS library is essentially a storage location for SAS data sets, grouping related data sets under a specific name and providing a reference you can use in code. Usually, SAS libraries are used to organize data sets by project. Note that, by default, data sets are stored temporarily in the WORK library and are automatically deleted at the end of the current session.
One of the ways to create a new SAS library is the LIBNAME statement. Here's the basic syntax:
LIBNAME MyData '/path/to/your/library';
In SAS OnDemand for Academics (ODA), your library paths will always begin with '/home/your-user-name/'. You can find the user name for the path at the bottom right corner of the SAS Studio window in your browser.
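As an illustration of the difference between the temporary WORK library and a library you assign yourself, consider the sketch below. The libref mydata and the folder name projects are hypothetical, and the folder is assumed to already exist in your account; sashelp.class is a sample table that ships with SAS.

```sas
/* Assumption: /home/your-user-name/projects is an existing folder */
libname mydata '/home/your-user-name/projects';

/* One-level name: stored in WORK and deleted when the session ends */
data scratch;
    set sashelp.class;
run;

/* Two-level name (libref.dataset): stored in the MYDATA library */
/* and still available in later sessions                         */
data mydata.class_copy;
    set sashelp.class;
run;
```

In a later session, the same LIBNAME statement has to be run again before mydata.class_copy can be referenced, unless the library is re-created automatically at start-up.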
Alternatively, you can create a library through SAS ODA's graphical interface. Navigate to the "Libraries" section on the left panel, then right-click "My Libraries." You will find the "New Library" button:
In the New Library window:
- Name: Specify a name for the new library. It should be descriptive and no longer than 8 characters, must start with a letter or an underscore, and cannot contain blanks or special characters other than the underscore.
- Path: The directory where the library is located, equivalent to the file path specified in a LIBNAME statement. You may click the Browse button to select the directory for the new library.
- Re-create this Library at start-up: Optionally, you can set SAS to "remember" the new library every time you start a new SAS Studio session. This ensures that the library exists in each new session.
Introducing the SAS Programming Language
To have greater control and flexibility over your programs, you may consider learning the SAS programming language. Some might argue that SAS products mostly function as point-and-click programs and there is no need to learn the language. However, while it is possible to perform basic data analysis solely through the menu-driven interface of SAS, learning the language enables users to understand what's going on behind the scenes of data processing, automate repetitive tasks, and customize the analysis program with greater flexibility.
In essence, a SAS program is a sequence of steps, each of which is either a DATA step or a PROC step. A DATA step typically creates or modifies a SAS data set, for example by reading raw data sources. A PROC (procedure) step, on the other hand, calls a built-in procedure to process, analyze, or report on a data set.
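A minimal sketch of a program with one DATA step followed by one PROC step might look like this. The data set name work.scores and its values are made up purely for illustration.

```sas
/* DATA step: build a small data set from inline raw data */
data work.scores;
    input name $ score;   /* name is character ($), score is numeric */
    datalines;
Ann 85
Bob 72
Cara 91
;
run;

/* PROC step: compute summary statistics on that data set */
proc means data=work.scores mean min max;
    var score;
run;
```

Each step ends with a RUN statement, which makes the boundary between the two steps easy to see.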
Either way, every step is a block of SAS statements. The SAS statements provide detailed instructions to SAS on how to handle data values. Here's a breakdown of the basic syntax of a SAS statement:
- SAS Keywords: Every SAS statement begins with a keyword that specifies the action you want SAS to take. For example, the DATA keyword starts a new DATA step, and PROC PRINT prints a data set (the most recently created one if none is specified).
- Options/References: Following the keyword, you may also include options or references to provide additional details for the action:
  - Literals: Numbers or text string values.
  - Variables: Named columns in your SAS data set holding data values.
  - Expressions: Combinations of variables, literals, and operators.
  - Options: Additional specifications modifying the behavior of SAS statements.
- Semicolon: Every SAS statement must end with a semicolon. It signals to SAS that the instruction is complete. Omitting a semicolon is a very common mistake that even experienced SAS programmers make, so double-check that you haven't forgotten one in your SAS statements.
- Case: SAS keywords and syntax are not case sensitive, meaning that they can be written in uppercase or lowercase with no difference in functionality.
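To make these components concrete, here is a small sketch that labels each part in comments. It uses sashelp.class, a sample table that ships with SAS; the choice of variables and the age cutoff are arbitrary.

```sas
/* PROC PRINT: keyword; DATA= names the data set; NOOBS is an option */
/* that suppresses the observation-number column                     */
proc print data=sashelp.class noobs;
    var name age height;   /* VAR keyword followed by three variables */
    where age >= 13;       /* expression: variable, operator, numeric literal */
run;                       /* every statement ends with a semicolon */
```

Writing the same step as PROC PRINT DATA=SASHELP.CLASS NOOBS; would behave identically, since keywords and names are not case sensitive.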
There really aren't any widely accepted rules about how to format your SAS program. SAS statements can start in any column, continue on the next line (as long as you don't split words in two), or share a line with other statements. However, neatly organizing your SAS code is always helpful, as it makes your program more maintainable.
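As a rough illustration of this freedom, the two steps below should be treated identically by SAS even though they are laid out very differently. The indentation used in the second version is just one common convention, not a rule.

```sas
/* Everything on one line: valid, but hard to read */
proc print data=sashelp.class; var name age; run;

/* Same step, one statement per line, with a statement */
/* continued across lines and mixed-case keywords      */
PROC PRINT
    DATA = sashelp.class;
    VAR name
        age;
RUN;
```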
Adding Comments
Just as in any other programming language, you can add comments in SAS for code reviewers and for yourself. There are two main ways to include comments in your SAS program:
- Single-line comments:
  - Start the comment with an asterisk (*). A space after the asterisk is optional but improves readability.
  - All text up to the next semicolon (;) is treated as a comment and ignored by SAS.
- Multi-line comments:
  - Start the comment with /* and end it with */.
  - Everything between /* and */ is treated as a comment, even if it spans multiple lines.
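Both comment styles can be sketched as follows. The step itself is only a placeholder that prints a few rows of the sashelp.class sample table.

```sas
* This statement-style comment ends at the semicolon;

/*
   This block comment spans several lines.
   It may contain semicolons; they do not end the comment.
*/
proc print data=sashelp.class (obs=5);   /* OBS=5 limits output to five rows */
run;
```

The /* ... */ style is also handy for temporarily disabling a statement, because the semicolons inside it are ignored.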
Programming Tips
People who have no experience in any programming language often get frustrated when their programs don't work correctly on the first try. Don't try to tackle a long complicated program all at once. By starting small, building upon what works, and consistently checking your results, you can enhance your programming efficiency.
Even if you get errors, don't be discouraged. Even experienced SAS programmers make simple mistakes: they forget a semicolon, misspell a word, or place statements in the wrong order. These small mistakes can cause a whole list of errors. Sometimes, even when a program runs without errors, its results may still be incorrect. It is always good practice to test your code on small cases.
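One possible way to test on a small case is the OBS= data set option, which limits how many rows a step reads. The sketch below uses the sashelp.cars sample table, and the choice of 10 rows is arbitrary.

```sas
/* While developing, run the step on the first 10 rows only */
proc means data=sashelp.cars (obs=10) mean;
    var msrp;
run;

/* Once the log is clean and the result looks right, */
/* remove (obs=10) to process the full table         */
```

Checking the log after each run, not just the output, helps catch notes and warnings that do not stop the program.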