tag:blogger.com,1999:blog-42063922477469302562024-03-18T20:29:08.159-07:00The Tech FeastA Glimpse at the World of Computer ScienceHiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.comBlogger133125tag:blogger.com,1999:blog-4206392247746930256.post-76172362667140987422015-06-20T15:40:00.001-07:002015-06-20T15:40:05.107-07:00Expose Any Shell Command or Script as a Web API<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
I implemented a tool that can expose any shell command or script as a simple web API. All you have to specify is the binary (command/script) that needs to be exposed, and optionally a port number for the HTTP server. The full source code of the tool is shown below. In addition to exposing simple web APIs, this code also demonstrates Golang's built-in logging package, slice-to-varargs conversion, and a couple of other neat tricks.</div>
<pre class="brush:cpp">// This tool exposes any binary (shell command/script) as an HTTP service.
// A remote client can trigger the execution of the command by sending
// a simple HTTP request. The output of the command execution is sent
// back to the client in plain text format.
package main
import (
"flag"
"fmt"
"io/ioutil"
"log"
"net/http"
"os"
"os/exec"
"strings"
)
func main() {
binary := flag.String("b", "", "Path to the executable binary")
port := flag.Int("p", 8080, "HTTP port to listen on")
flag.Parse()
if *binary == "" {
fmt.Println("Path to binary not specified.")
return
}
l := log.New(os.Stdout, "", log.Ldate|log.Ltime)
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
var argString string
if r.Body != nil {
data, err := ioutil.ReadAll(r.Body)
if err != nil {
l.Print(err)
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
argString = string(data)
}
fields := strings.Fields(*binary)
args := append(fields[1:], strings.Fields(argString)...)
l.Printf("Command: [%s %s]", fields[0], strings.Join(args, " "))
output, err := exec.Command(fields[0], args...).Output()
if err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
w.Header().Set("Content-Type", "text/plain")
w.Write(output)
})
l.Printf("Listening on port %d...", *port)
l.Printf("Exposed binary: %s", *binary)
	// Listen on all interfaces so that remote clients can connect,
	// and log a fatal error if the server fails to start.
	l.Fatal(http.ListenAndServe(fmt.Sprintf(":%d", *port), nil))
}</pre>
<div style="text-align: justify;">
Clients invoke the web API by sending HTTP GET and POST requests. They can also pass additional flags and arguments to the command/script wrapped by the web API. The result of the command/script execution is sent back to the client as a plain text payload.</div>
<div style="text-align: justify;">
As an example, assume you need to expose the "date" command as a web API. You can simply run the tool as follows:</div>
<pre>./bash2http -b date</pre>
<div style="text-align: justify;">
Now, the clients can invoke the API by sending an HTTP request to http://host:8080. The tool will run the "date" command on the server, and send the resulting text back to the client. Similarly, to expose the "ls" command with the "-l" flag (i.e. long output format), we can execute the tool as follows:</div>
<pre>./bash2http -b "ls -l"</pre>
<div style="text-align: justify;">
Users sending an HTTP request to http://host:8080 will now get a file listing (in the long output format, of course) of the current directory of the server. Alternatively, users can POST additional flags and a file path to the web API to get more specific output. For instance:</div>
<pre>curl -v -X POST -d "-h /usr/local" http://host:8080</pre>
<div style="text-align: justify;">
This will return a listing of the /usr/local directory of the server, with human-readable file size information.</div>
<div style="text-align: justify;">
You can also use this tool to expose custom shell scripts and other command-line programs. For example, if you have a Python script foo.py which you wish to expose as a web API, all you have to do is:</div>
<pre>./bash2http -b "python foo.py"</pre>
</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com2tag:blogger.com,1999:blog-4206392247746930256.post-32597042581695610332015-06-08T11:59:00.002-07:002015-06-08T11:59:24.114-07:00Exposing a Local Directory Through a Web Server<div style="text-align: justify;">
Did you ever encounter a situation where you have to serve the contents of a directory in the local file system through a web server? Usually this scenario occurs when you want to quickly try out some HTML+JS+CSS combo, or when you want to temporarily share the directory with a remote user. How would you go about doing this? Setting up Apache HTTP server or something similar could take time, and you definitely don't want to write new code to achieve such a simple goal. Ideally, what you want is a simple command that, when executed, starts serving the current directory through a web server.</div>
<div style="text-align: justify;">
The good news is, if you have Python installed on your machine, you already have access to such a command:</div>
<pre>python -m SimpleHTTPServer 8000</pre>
<div style="text-align: justify;">
The last argument (8000) is the port number for the HTTP server. This will spawn a lightweight HTTP server, using the current directory as the document root. (On Python 3, the equivalent command is <i>python3 -m http.server 8000</i>.) Hit ctrl+c to kill the server process when you're done with it.</div>
<div style="text-align: justify;">
Alternatively, you can write your own solution and install it permanently into the system, so you can reuse it in the future. Here's a working solution written in Go:</div>
<pre class="brush:cpp">package main
import (
"log"
"net/http"
)
func main() {
log.Fatal(http.ListenAndServe(":8080", http.FileServer(http.Dir("."))))
}</pre>
<div style="text-align: justify;">
The port number (8080) is hardcoded into the solution, but it's not that hard to change it.</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-13816123629970447192015-05-13T19:19:00.001-07:002015-05-13T19:21:03.510-07:00Using Java Thread Pools<div style="text-align: justify;">
Here's a quick (and somewhat dirty) solution in Java for processing a set of tasks in parallel. It does not require any third-party libraries. Users specify the tasks to be executed by implementing the Task interface. A collection of Task instances can then be passed to the TaskFarm.processInParallel method. This method farms the tasks out to a thread pool and waits for them to finish. When all tasks have finished, it gathers their outputs into another collection, and returns it as the final outcome of the method invocation.
</div>
<div style="text-align: justify;">
This solution also provides some control over the number of threads that will be employed to process the tasks. If a positive value is provided as the max argument, it will use a fixed thread pool with an unbounded queue to ensure that no more than 'max' tasks will be executed in parallel at any time. By specifying a non-positive value for the max argument, the caller can request the TaskFarm to use as many threads as needed.</div>
<div style="text-align: justify;">
If any of the Task instances throw an exception, the processInParallel method will also throw an exception.</div>
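<div style="text-align: justify;">
The Task interface itself is not shown in the post. A minimal definition compatible with the TaskFarm code below might look like this (the TaskDemo class and the 6 * 7 example are mine, added purely to show the contract):</div>

```java
// Hypothetical definition of the Task interface assumed by TaskFarm,
// with a direct invocation to demonstrate the contract.
interface Task<T> {
    // Perform the unit of work and return its result.
    T process() throws Exception;
}

public class TaskDemo {
    public static void main(String[] args) throws Exception {
        // Task has a single abstract method, so a lambda works here.
        Task<Integer> answer = () -> 6 * 7;
        System.out.println(answer.process()); // prints 42
    }
}
```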
<pre class="brush:java">
package edu.ucsb.cs.eager;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.*;
public class TaskFarm<T> {
/**
* Process a collection of tasks in parallel. Wait for all tasks to finish, and then
* return all the results as a collection.
*
* @param tasks The collection of tasks to be processed
* @param max Maximum number of parallel threads to employ (non-positive values
* indicate no upper limit on the thread count)
* @return A collection of results
* @throws Exception If at least one of the tasks fail to complete normally
*/
public Collection<T> processInParallel(Collection<Task<T>> tasks, int max) throws Exception {
ExecutorService exec;
if (max <= 0) {
exec = Executors.newCachedThreadPool();
} else {
exec = Executors.newFixedThreadPool(max);
}
try {
List<Future<T>> futures = new ArrayList<>();
// farm it out...
for (Task<T> t : tasks) {
final Task<T> task = t;
Future<T> f = exec.submit(new Callable<T>() {
@Override
public T call() throws Exception {
return task.process();
}
});
futures.add(f);
}
List<T> results = new ArrayList<>();
// wait for the results
for (Future<T> f : futures) {
results.add(f.get());
}
return results;
} finally {
exec.shutdownNow();
}
}
}</pre>Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-15463717215558154332015-05-05T16:34:00.000-07:002015-05-05T16:36:23.951-07:00Parsing Line-Oriented Text Files Using Go<div style="text-align: justify;">
The following example demonstrates several features of Golang, such as reading a file line-by-line (with error handling), deferred statements, and higher-order functions.</div>
<pre class="brush:cpp">
package main
import (
"bufio"
"fmt"
"os"
)
func ParseLines(filePath string, parse func(string) (string,bool)) ([]string, error) {
inputFile, err := os.Open(filePath)
if err != nil {
return nil, err
}
defer inputFile.Close()
scanner := bufio.NewScanner(inputFile)
var results []string
for scanner.Scan() {
if output, add := parse(scanner.Text()); add {
results = append(results, output)
}
}
if err := scanner.Err(); err != nil {
return nil, err
}
return results, nil
}
func main() {
if len(os.Args) != 2 {
fmt.Println("Usage: line_parser <path>")
return
}
lines, err := ParseLines(os.Args[1], func(s string)(string,bool){
return s, true
})
if err != nil {
fmt.Println("Error while parsing file", err)
return
}
for _, l := range lines {
fmt.Println(l)
}
}
</pre>
<div style="text-align: justify;">
The ParseLines function takes a path (<i>filePath</i>) to an input file, and a function (<i>parse</i>) that will be applied to each line read from the input file. The parse function returns a (string, bool) pair, where the boolean indicates whether the string should be added to the final result of ParseLines. The example shows how to simply read and print all the lines of the input file.</div>
<div style="text-align: justify;">
The caller can inject more sophisticated transformation and filtering logic into ParseLines via the parse function. The following example invocation filters out all the strings that do not begin with the prefix "[valid]", and extracts the 3rd field from each line (assuming a simple whitespace separated line format).</div>
<pre class="brush:cpp">
lines, err := ParseLines(os.Args[1], func(s string)(string,bool){
if strings.HasPrefix(s, "[valid] ") {
return strings.Fields(s)[2], true
}
return s, false
})
</pre>
<div style="text-align: justify;">
A function like ParseLines is suitable for parsing small to moderately large files. However, if the input file is very large, ParseLines may cause some issues, since it accumulates the results in memory.
</div>Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-50678683344618819352015-03-20T10:41:00.000-07:002015-03-20T10:42:18.524-07:00QBETS: A Time Series Analysis and Forecasting Method<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="MsoNormal" style="text-align: justify;">
Today I’m going to share some details on an analytics
technology I’ve been using for my research.</div>
<div class="MsoNormal" style="text-align: justify;">
<a href="http://dl.acm.org/citation.cfm?id=1791551.1791556&coll=DL&dl=GUIDE&CFID=636697972&CFTOKEN=10376914">QBETS</a> (Queue Bounds Estimation from Time Series) is a
non-parametric time series analysis method. The basic idea behind QBETS is to
analyze a time series, and predict the p-th percentile of it, where p is a
user-specified parameter. QBETS learns from the existing data points in the
input time series, and estimates a p-th percentile value such that the next
data point in the time series has a 0.01p probability of being less than or
equal to the estimated value.</div>
<div class="MsoNormal" style="text-align: justify;">
For example, suppose we have the following input time
series, and we wish to predict the 95<sup>th</sup> percentile of it:</div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
A<sub>0</sub>, A<sub>1</sub>, A<sub>2</sub>, …, A<sub>n</sub></div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
If QBETS predicts the value Q as the 95<sup>th</sup>
percentile, we can say that A<sub>n+1</sub> (the next data point that will be
added to the time series by the generating process) has a 95% chance of being
less than or equal to Q:</div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
P(A<sub>n+1</sub> &le; Q) = 0.01p</div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
Since QBETS cannot determine the percentile values exactly,
but must estimate them, it uses another parameter c (0 &lt; c &lt; 1) as an
upper confidence bound on the estimated values. That is, if QBETS is used to
estimate the p-th percentile of a time series with upper confidence c, its
predictions overestimate the true p-th percentile with probability 1 - c. For
instance, if c = 0.05, then QBETS will generate predictions that overestimate
the true p-th percentile 95% of the time. We primarily use the parameter c as a
means of controlling how conservative QBETS should be when predicting
percentiles.</div>
<div class="MsoNormal" style="text-align: justify;">
QBETS also supports a technique known as change point
detection. To understand what this means, let’s look at the following input
time series.</div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
7, 8, 7, 7, 9, 8, 7, 7, <b>15, 15, 16, 14, 16, 17, 15</b></div>
<div class="MsoNormal" style="text-align: justify;">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
Here we see a sudden shift in the values after the first 8
data points: the individual values have increased from the 7-9 range
to the 14-17 range. QBETS detects such change points in the time series, and
discards the data points that precede them. This is necessary to
make sure that the predictions are not influenced by old historical values that
are no longer relevant to the time series generating process.</div>
<div class="MsoNormal" style="text-align: justify;">
The paper that originally introduced QBETS used it as a
mechanism for predicting scheduling delays in batch queuing systems for
supercomputers and other HPC systems. Over the years, researchers have used QBETS
with a wide range of datasets, and it has produced positive results in almost
all cases. Lately, I have been using QBETS to predict API
response times by analyzing historical API performance data. Again, the
results have been quite promising.</div>
<!--[if gte mso 9]><xml>
<o:DocumentProperties>
<o:Revision>0</o:Revision>
<o:TotalTime>0</o:TotalTime>
<o:Pages>1</o:Pages>
<o:Words>407</o:Words>
<o:Characters>2321</o:Characters>
<o:Company>UC Santa Barbara</o:Company>
<o:Lines>19</o:Lines>
<o:Paragraphs>5</o:Paragraphs>
<o:CharactersWithSpaces>2723</o:CharactersWithSpaces>
<o:Version>14.0</o:Version>
</o:DocumentProperties>
<o:OfficeDocumentSettings>
<o:AllowPNG/>
</o:OfficeDocumentSettings>
</xml><![endif]-->
<!--[if gte mso 9]><xml>
<w:WordDocument>
<w:View>Normal</w:View>
<w:Zoom>0</w:Zoom>
<w:TrackMoves/>
<w:TrackFormatting/>
<w:PunctuationKerning/>
<w:ValidateAgainstSchemas/>
<w:SaveIfXMLInvalid>false</w:SaveIfXMLInvalid>
<w:IgnoreMixedContent>false</w:IgnoreMixedContent>
<w:AlwaysShowPlaceholderText>false</w:AlwaysShowPlaceholderText>
<w:DoNotPromoteQF/>
<w:LidThemeOther>EN-US</w:LidThemeOther>
<w:LidThemeAsian>JA</w:LidThemeAsian>
<w:LidThemeComplexScript>X-NONE</w:LidThemeComplexScript>
<w:Compatibility>
<w:BreakWrappedTables/>
<w:SnapToGridInCell/>
<w:WrapTextWithPunct/>
<w:UseAsianBreakRules/>
<w:DontGrowAutofit/>
<w:SplitPgBreakAndParaMark/>
<w:EnableOpenTypeKerning/>
<w:DontFlipMirrorIndents/>
<w:OverrideTableStyleHps/>
<w:UseFELayout/>
</w:Compatibility>
<m:mathPr>
<m:mathFont m:val="Cambria Math"/>
<m:brkBin m:val="before"/>
<m:brkBinSub m:val="--"/>
<m:smallFrac m:val="off"/>
<m:dispDef/>
<m:lMargin m:val="0"/>
<m:rMargin m:val="0"/>
<m:defJc m:val="centerGroup"/>
<m:wrapIndent m:val="1440"/>
<m:intLim m:val="subSup"/>
<m:naryLim m:val="undOvr"/>
</m:mathPr></w:WordDocument>
</xml><![endif]--><!--[if gte mso 9]><xml>
<w:LatentStyles DefLockedState="false" DefUnhideWhenUsed="true"
DefSemiHidden="true" DefQFormat="false" DefPriority="99"
LatentStyleCount="276">
<w:LsdException Locked="false" Priority="0" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Normal"/>
<w:LsdException Locked="false" Priority="9" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="heading 1"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 2"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 3"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 4"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 5"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 6"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 7"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 8"/>
<w:LsdException Locked="false" Priority="9" QFormat="true" Name="heading 9"/>
<w:LsdException Locked="false" Priority="39" Name="toc 1"/>
<w:LsdException Locked="false" Priority="39" Name="toc 2"/>
<w:LsdException Locked="false" Priority="39" Name="toc 3"/>
<w:LsdException Locked="false" Priority="39" Name="toc 4"/>
<w:LsdException Locked="false" Priority="39" Name="toc 5"/>
<w:LsdException Locked="false" Priority="39" Name="toc 6"/>
<w:LsdException Locked="false" Priority="39" Name="toc 7"/>
<w:LsdException Locked="false" Priority="39" Name="toc 8"/>
<w:LsdException Locked="false" Priority="39" Name="toc 9"/>
<w:LsdException Locked="false" Priority="35" QFormat="true" Name="caption"/>
<w:LsdException Locked="false" Priority="10" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Title"/>
<w:LsdException Locked="false" Priority="1" Name="Default Paragraph Font"/>
<w:LsdException Locked="false" Priority="11" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Subtitle"/>
<w:LsdException Locked="false" Priority="22" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Strong"/>
<w:LsdException Locked="false" Priority="20" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Emphasis"/>
<w:LsdException Locked="false" Priority="59" SemiHidden="false"
UnhideWhenUsed="false" Name="Table Grid"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Placeholder Text"/>
<w:LsdException Locked="false" Priority="1" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="No Spacing"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 1"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 1"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 1"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 1"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 1"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 1"/>
<w:LsdException Locked="false" UnhideWhenUsed="false" Name="Revision"/>
<w:LsdException Locked="false" Priority="34" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="List Paragraph"/>
<w:LsdException Locked="false" Priority="29" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Quote"/>
<w:LsdException Locked="false" Priority="30" SemiHidden="false"
UnhideWhenUsed="false" QFormat="true" Name="Intense Quote"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 1"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 1"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 1"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 1"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 1"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 1"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 1"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 1"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 2"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 2"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 2"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 2"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 2"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 2"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 2"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 2"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 2"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 2"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 2"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 2"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 2"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 2"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 3"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 3"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 3"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 3"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 3"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 3"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 3"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 3"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 3"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 3"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 3"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 3"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 3"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 3"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 4"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 4"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 4"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 4"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 4"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 4"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 4"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 4"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 4"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 4"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 4"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 4"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 4"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 4"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 5"/>
<w:LsdException Locked="false" Priority="61" SemiHidden="false"
UnhideWhenUsed="false" Name="Light List Accent 5"/>
<w:LsdException Locked="false" Priority="62" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Grid Accent 5"/>
<w:LsdException Locked="false" Priority="63" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 1 Accent 5"/>
<w:LsdException Locked="false" Priority="64" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Shading 2 Accent 5"/>
<w:LsdException Locked="false" Priority="65" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 1 Accent 5"/>
<w:LsdException Locked="false" Priority="66" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium List 2 Accent 5"/>
<w:LsdException Locked="false" Priority="67" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 1 Accent 5"/>
<w:LsdException Locked="false" Priority="68" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 2 Accent 5"/>
<w:LsdException Locked="false" Priority="69" SemiHidden="false"
UnhideWhenUsed="false" Name="Medium Grid 3 Accent 5"/>
<w:LsdException Locked="false" Priority="70" SemiHidden="false"
UnhideWhenUsed="false" Name="Dark List Accent 5"/>
<w:LsdException Locked="false" Priority="71" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Shading Accent 5"/>
<w:LsdException Locked="false" Priority="72" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful List Accent 5"/>
<w:LsdException Locked="false" Priority="73" SemiHidden="false"
UnhideWhenUsed="false" Name="Colorful Grid Accent 5"/>
<w:LsdException Locked="false" Priority="60" SemiHidden="false"
UnhideWhenUsed="false" Name="Light Shading Accent 6"/>
</w:LatentStyles>
</xml><![endif]-->
<br />
<div class="MsoNormal" style="text-align: justify;">
To learn more about QBETS, go through the <a href="http://dl.acm.org/citation.cfm?id=1791551.1791556&coll=DL&dl=GUIDE&CFID=636697972&CFTOKEN=10376914">paper</a> or contact
the authors.</div>
</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-68602357771531703082015-01-11T14:44:00.006-08:002015-01-11T14:44:57.804-08:00Creating Eucalyptus Machine Images from a Running VM<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
I often use the <a href="https://www.eucalyptus.com/">Eucalyptus</a> private cloud platform for my research, and I frequently need to start Linux VMs in Eucalyptus and install a whole stack of software on them. This involves a lot of repetitive work, so to save time I prefer creating machine images (EMIs) from fully configured VMs. This post outlines the steps one should follow to create an EMI from a VM running in Eucalyptus (tested on Ubuntu Lucid and Precise VMs).</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<b>Step 1: SSH into the VM running in Eucalyptus, if you haven't already.</b></div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<b>Step 2: Run the euca-bundle-vol command to create an image file (snapshot) from the VM's root file system.</b></div>
<blockquote class="tr_bq" style="text-align: justify;">
euca-bundle-vol -p root -d /mnt -s 10240</blockquote>
<div style="text-align: justify;">
Here "-p" is the name (prefix) you wish to give to the image file. "-s" is the size of the image in megabytes. In the above example it is set to 10GB, which also happens to be the largest value the "-s" argument accepts. "-d" is the directory in which the image file should be placed. Make sure this directory has enough free space to accommodate the image size specified in "-s". </div>
<div style="text-align: justify;">
This command may take several minutes to execute. For a 10GB image, it may take around 3 to 8 minutes. When completed, check the contents of the directory specified in argument "-d". You will see an XML manifest file and a number of image part files in there.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<b>Step 3: Upload the image file to the Eucalyptus cloud using the euca-upload-bundle command.</b></div>
<blockquote class="tr_bq" style="text-align: justify;">
euca-upload-bundle -b my-test-image -m /mnt/root.manifest.xml</blockquote>
<div style="text-align: justify;">
Here "-b" is the name of the bucket (in Walrus key-value store) to which the image file should be uploaded. You don't have to create the bucket beforehand. This command will create the bucket if it doesn't already exist. "-m" should point to the XML manifest file generated in the previous step.</div>
<div style="text-align: justify;">
This command requires certain environment variables to be exported (primarily access keys and certificate paths). The easiest way to do that is to copy your eucarc file and the associated keys into the VM and source the eucarc file into the environment.</div>
<div style="text-align: justify;">
This command also may take several minutes to complete. At the end, it will output a string of the form "bucket-name/manifest-file-name".</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<b>Step 4: Register the newly uploaded image file with Eucalyptus.</b></div>
<blockquote class="tr_bq" style="text-align: justify;">
euca-register my-test-image/root.manifest.xml</blockquote>
<div style="text-align: justify;">
The only parameter required here is the "bucket-name/manifest-file-name" string returned from the previous step. I've noticed that in some cases, running this command from the VM in Eucalyptus doesn't work (you will get a "404 Not Found" error). In that case you can simply run the command from somewhere else -- somewhere outside the Eucalyptus cloud. If all goes well, the command will return with an EMI ID. At this point you can launch instances of your image using the euca-run-instances command.</div>
</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-27914771160474260482015-01-02T15:20:00.001-08:002015-01-02T15:20:34.341-08:00Developing Web Services with Go<div style="text-align: justify;">
Golang facilitates implementing powerful web applications and services with a very small amount of code. It can be used to implement both HTML-rendering webapps and XML/JSON-rendering web APIs. In this post, I'm going to demonstrate how easy it is to implement a simple JSON-based web service in Go. We are going to implement a simple addition service that takes two integers as input and returns their sum as output.</div>
<pre class="brush:cpp">
package main

import (
    "encoding/json"
    "net/http"
)

type addReq struct {
    Arg1, Arg2 int
}

type addResp struct {
    Sum int
}

func addHandler(w http.ResponseWriter, r *http.Request) {
    decoder := json.NewDecoder(r.Body)
    var req addReq
    if err := decoder.Decode(&req); err != nil {
        // A malformed request is the client's fault, so report 400 rather than 500.
        http.Error(w, err.Error(), http.StatusBadRequest)
        return
    }
    jsonString, err := json.Marshal(addResp{Sum: req.Arg1 + req.Arg2})
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    w.Header().Set("Content-Type", "application/json")
    w.Write(jsonString)
}

func main() {
    http.HandleFunc("/add", addHandler)
    http.ListenAndServe(":8080", nil)
}
</pre>
<div style="text-align: justify;">
Let's review the code from top to bottom. First we need to import the JSON and HTTP packages into our code. The JSON package provides the functions for parsing and marshaling JSON messages. The HTTP package enables processing HTTP requests. Then we define two data types (addReq and addResp) to represent the incoming JSON request and the outgoing JSON response. Note how addReq contains two integers (Arg1, Arg2) for the two input values, while addResp contains a single integer (Sum) for holding the total.</div>
<div style="text-align: justify;">
Next we define an HTTP handler function, which implements the logic of our web service. This function simply parses the incoming request, and populates an instance of the addReq struct. Then it creates an instance of the addResp struct, and serializes it into JSON. The resulting JSON string is then written out using the http.ResponseWriter object.</div>
<div style="text-align: justify;">
Finally, we have a main function that ties everything together and starts the web service. The main function simply registers our HTTP handler with the "/add" URL context, and starts an HTTP server on port 8080. This means any requests sent to the "/add" URL will be dispatched to the addHandler function for processing.</div>
<div style="text-align: justify;">
That's all there is to it. You may compile and run the program to try it out. Use curl as follows to send a test request.</div>
<pre>
curl -v -X POST -d '{"Arg1":5, "Arg2":4}' http://localhost:8080/add
</pre>
<div style="text-align: justify;">
You will get back a JSON response with the total: {"Sum":9}.</div>Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-23533540969656004242014-12-03T16:17:00.003-08:002014-12-03T16:20:49.872-08:00Controlled Concurrency with Golang<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
Lately I have been doing a lot of programming in Golang. It is one of those languages which is somewhat difficult to fully grasp at the beginning. But a few hundred lines of code later, you feel like you cannot get enough of it -- very simple syntax, brilliant performance and very clean and precise API semantics. This language has got it all.</div>
<div style="text-align: justify;">
Concurrent programming is one area where Golang really excels. Goroutines make it trivial to start concurrent threads of execution, channels are a first-class programming construct, and a plethora of built-in utilities and packages (e.g. sync) make the developer's life a lot easier. In this post I'm going to give a brief overview of how to instantiate new threads of execution in Golang. Let's start with a piece of sequential code:</div>
<pre class="brush: cpp">for i := 0; i < len(array); i++ {
    doSomething(array[i])
}
</pre>
<div style="text-align: justify;">
The above code iterates over an array, and calls the function doSomething for each element in the array. But this code is sequential, which means doSomething(n) won't be called until doSomething(n-1) returns. Suppose you want to speed things up a little by running multiple invocations of doSomething in parallel (assuming it is safe to do so -- both control and data wise). In Golang this is all you have to do:</div>
<pre class="brush: cpp">for i := 0; i < len(array); i++ {
    go doSomething(array[i])
}
</pre>
<div style="text-align: justify;">
The go keyword will start the doSomething function as a separate concurrent goroutine. But this code change causes an uncontrolled concurrency situation. In other words, the only thing that's limiting the number of parallel goroutines spawned by the program is the length of the array, which is not a good idea if the array has thousands of entries. Ideally, we need to put some kind of a fixed cap on how many goroutines are spawned by the loop. This can be easily achieved by using a channel with a fixed capacity.</div>
<pre class="brush: cpp">c := make(chan bool, 8)
for i := 0; i < len(array); i++ {
    c <- true
    go func(index int) {
        doSomething(index)
        <-c
    }(i)
}
</pre>
<div style="text-align: justify;">
We start by creating a channel that can hold at most 8 boolean values. Then inside the loop, whenever we spawn a goroutine, we first send a boolean value (true) into the channel. This operation will get blocked if the channel is already full (i.e it has 8 elements). Then in the goroutine, we remove an element from the channel before we return. This little trick makes sure that at most 8 parallel goroutines will be active in the program at any given time. If you need to change this limit, you simply have to change the max capacity of the channel. You can set this to a fixed number, or write some code to figure out the optimal value based on the number of CPU cores available in the system.</div>
</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-14291410906333404632014-11-22T19:36:00.000-08:002014-11-22T19:36:51.152-08:00Running Python from Python<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
It has been pointed out to me that I don't blog as often as I used to. So here's a first step towards rectifying that.</div>
<div style="text-align: justify;">
In this post, I'm going to briefly describe the support that Python provides for processing, well, "Python". If you're using Python for simple scripting and automation tasks, you might often have to load, parse and execute other Python files from your code. While you can always "import" some Python code as a module and execute it, in many situations it is impossible to determine precisely at development time which Python files your code needs to import. Also, some Python scripts are written as simple executable files, which are not ideal for inclusion via import. To deal with cases such as these, Python provides several built-in features that allow referring to and executing other Python files.</div>
<div style="text-align: justify;">
One of the easiest ways to execute an external Python file is by using the built-in <a href="https://docs.python.org/2/library/functions.html#execfile">execfile</a> function. This function takes the path to another Python file as the only mandatory argument. Optionally, we can also provide a global and a local namespace. If provided, the external code will be executed within those namespace contexts. This is a great way to exert some control over how certain names mentioned in the external code will be resolved (more on this later).</div>
<pre class="brush:python">execfile('/path/to/code.py')
</pre>
Another way to include some external code in your script is by using the built-in <a href="https://docs.python.org/2/library/functions.html#__import__">__import__</a> function. This is the same function that gets called when we use the usual "import" keyword to include some module. But unlike the keyword, the __import__ function gives you a lot more control over certain matters like namespaces.<br />
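As a minimal illustration, __import__ lets you import a module whose name is only known at runtime (the choice of the json module here is arbitrary):

```python
# The module to load is computed at runtime -- something the regular
# "import" keyword cannot express directly.
module_name = "json"
mod = __import__(module_name)

# The returned module object works just like a normally imported module.
print(mod.dumps({"a": 1}))
```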
<div style="text-align: justify;">
Another way to run some external Python code from your Python script is to first read the external file contents into memory (as a string), and then use the <a href="https://docs.python.org/2/reference/simple_stmts.html#exec">exec</a> keyword on it. The exec keyword can be used as a function call or as a keyword statement.</div>
<pre class="brush:python">code_string = load_file_content('/path/to/code.py')
exec(code_string)
</pre>
<div style="text-align: justify;">
Similar to the execfile function, you have the option of passing custom global and local namespaces. Here's some code I've written for a project that uses the exec keyword:</div>
<pre class="brush:python">globals_map = globals().copy()
globals_map['app'] = app
globals_map['assert_app_dependency'] = assert_app_dependency
globals_map['assert_not_app_dependency'] = assert_not_app_dependency
globals_map['assert_app_dependency_in_range'] = assert_app_dependency_in_range
globals_map['assert_true'] = assert_true
globals_map['assert_false'] = assert_false
globals_map['compare_versions'] = compare_versions
try:
    exec(self.source_code, globals_map, {})
except Exception as ex:
    utils.log('[{0}] Unexpected policy exception: {1}'.format(self.name, ex))
</pre>
Here I first create a clone of the current global namespace, and pass it as an argument to the exec function. The clone is discarded at the end of the execution. This makes sure that the code in the external file does not pollute my existing global namespace. I also add some of my own variables and functions (e.g. assert_true, assert_false) into the global namespace clone, which allows the external code to refer to them as built-in constructs. In other words, the external script can be written in a slightly extended version of Python.
<br />
<div style="text-align: justify;">
There are other neat little tricks you can do with constructs like exec and execfile. Go through the official documentation for more details.</div>
</div>Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-81343013858579752052014-05-14T11:20:00.000-07:002014-05-14T11:21:47.994-07:00Java Code Analysis and Optimization with Soot<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
This is a quick shout out about the project <a href="http://www.sable.mcgill.ca/soot/">Soot</a>. If you're doing anything even remotely related to static analysis in Java, Soot is the way to go. It's simple, open source, well documented and extremely powerful. Soot can analyze any Java program (source or bytecode), and provide you with the control flow graph (CFG). Here's an example that shows how to construct the CFG for the main method of a class named MyClass.</div>
<pre class="brush:java">SootClass c = Scene.v().loadClassAndSupport("MyClass");
c.setApplicationClass();
SootMethod m = c.getMethodByName("main");
Body b = m.retrieveActiveBody();
UnitGraph g = new BriefUnitGraph(b);
</pre>
<div style="text-align: justify;">
Once you get your hands on the CFG, you can walk it, search it and do anything else you would normally do with a graph data structure. </div>
<div style="text-align: justify;">
Soot converts Java code into one of four intermediate representations (Jimple, Baf, Shimple and Grimp). These representations are designed to make it easier to analyze programs written in Java. For example, Jimple maps Java code from its typical stack-based model to a typed three-address representation. You can also make modifications/optimizations to the code and try out new ideas for compiler and runtime optimizations. Alternatively you can "tag" instructions with metadata, which can be helpful in building new development tools with powerful code visualization capabilities.</div>
<div style="text-align: justify;">
Soot also provides a set of APIs for performing <a href="http://www.sable.mcgill.ca/soot/tutorial/analysis/">data flow analysis</a>. These APIs can help you to code anything from live variable analysis to very busy expression analysis and more. And finally, Soot can also be invoked from the <a href="http://www.sable.mcgill.ca/soot/tutorial/usage/">command-line</a> without having to write any extension code.</div>
<div style="text-align: justify;">
So if you have any cool new ideas related to program analysis or optimization, grab the latest version of Soot. Whatever it is that you're trying to do, I'm sure Soot can help you implement it.</div>
</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com1tag:blogger.com,1999:blog-4206392247746930256.post-3203386946420007532014-01-02T14:16:00.002-08:002014-01-02T14:21:24.141-08:00Calling WSO2 Admin Services in Python<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
I’m using some <a href="http://wso2.com/" style="text-align: justify;">WSO2</a> middleware for my ongoing research, and recently I had the requirement of calling some admin services from Python 2.7. All WSO2 products expose a number of special administrative web services (admin services), using which the WSO2 server instances can be controlled, configured and monitored. In fact, all the web-based UI components that ship with WSO2 middleware make use of these admin services under the hood to manage the server runtime.</div>
<div style="text-align: justify;">
WSO2 admin services are SOAP services (based on <a href="http://axis.apache.org/axis2/java/core/">Apache Axis2</a>), and are secured using HTTP basic authentication. All admin services expose a WSDL document using which client applications can be written or generated to consume the admin services. In this post I’m going to summarize how to implement a simple Python client to consume the WSO2 admin services.</div>
<div style="text-align: justify;">
We will be writing our Python client using the <a href="https://fedorahosted.org/suds/">Suds</a> SOAP library for Python. Suds is simple, lightweight and extremely easy to use. As the first step, we should install Suds. Depending on the Python package manager you wish to use, one of the following commands should do the trick (tested on OS X and Ubuntu):</div>
<pre>
sudo easy_install suds
sudo pip install suds
</pre>
<div style="text-align: justify;">
Next we need to instruct the target WSO2 server product to expose the admin service WSDLs. By default these WSDLs are hidden. To unhide them, open up the repository/conf/carbon.xml file of the WSO2 product, and set the value of HideAdminServiceWSDLs parameter to false:</div>
<pre><HideAdminServiceWSDLs>false</HideAdminServiceWSDLs></pre>
<div style="text-align: justify;">
Now restart the WSO2 server, and you should be able to access the admin service WSDLs using a web browser. For example, to access the WSDL of the UserAdmin service, point your browser to <a href="http://localhost:9443/services/UserAdmin?wsdl">http://localhost:9443/services/UserAdmin?wsdl</a></div>
<div style="text-align: justify;">
Now we can go ahead and write the Python code to consume any of the available admin services. Here’s a working sample that consumes the UserAdmin service. This simply prints out a list of roles defined in the WSO2 User Management component:</div>
<pre class="brush:python">from suds.client import Client
from suds.transport.http import HttpAuthenticated
import logging
if __name__ == '__main__':
    # logging.basicConfig(level=logging.INFO)
    # logging.getLogger('suds.client').setLevel(logging.DEBUG)
    t = HttpAuthenticated(username='admin', password='admin')
    client = Client('https://localhost:9443/services/UserAdmin?wsdl',
                    location='https://localhost:9443/services/UserAdmin', transport=t)
    print client.service.getAllRolesNames()
</pre>
<div style="text-align: justify;">
That’s pretty much it. I have tested this approach with several WSO2 admin services, and they all seem to work without any issues. If you need to debug something, uncomment the two commented out lines in the above example. That will print all the SOAP messages and the HTTP headers that are being exchanged.</div>
<div style="text-align: justify;">
I also tried to write a client using the popular SOAPy library, but unfortunately couldn’t get it to work due to several bugs in SOAPy. SOAPy was incapable of retrieving the admin service WSDLs over HTTPS. This can be worked around by using the HTTP URL for the WSDL, but in that case SOAPy failed to generate the correct request messages to call the admin services. Basically, the namespaces of the generated SOAP messages were messed up. But with Suds I didn’t run into any issues.</div>
</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-14835464084181674072013-07-26T14:44:00.001-07:002013-07-26T14:44:56.482-07:00Avoiding the Risks of Cloud<div dir="ltr" style="text-align: left;" trbidi="on">
<br />
<div style="text-align: justify;">
It's no secret that cloud computing has transformed the way enterprises do business. It has changed the way developers write software and users interact with applications. By now, almost every business organization has a strategy on how to adopt the cloud. Those who don’t will soon be extinct. The influence of the cloud has been so phenomenal, that it truly has turned into a "take it or die" kind of a deal over the last few years.</div>
<div style="text-align: justify;">
It is also no secret that today the cloud movement is steered by a handful of giants in the IT industry. Companies like Amazon, Google, Microsoft and Salesforce are clearly among this elite group. These companies, their products and vision have been instrumental in the introduction, evolution and the popularization of the cloud technology. </div>
<div style="text-align: justify;">
With that being the case, we must think about the implications of cloud computing on the current IT landscape of the world. Are all small and medium-sized organizations around the world going to get rid of their server racks and transfer their IT infrastructure to <a href="http://aws.amazon.com/ec2/">Amazon EC2</a>? Are all Web applications and mobile applications going to be based on <a href="https://developers.google.com/appengine/">Google App Engine</a> APIs? Is all enterprise data going to end up in <a href="http://aws.amazon.com/s3/">Amazon S3</a> and <a href="http://research.google.com/pubs/pub36971.html">Google Megastore</a>? What sort of defenses are in place to prevent a few IT giants from monopolizing the entire IT infrastructure and services market? How easy would it be for us to migrate from one cloud vendor to another? All these are indeed very real and very important problems that all organizations should take under careful consideration.</div>
<div style="text-align: justify;">
Fortunately there are several practical solutions to all the above issues. One is openness and standardization. Cloud platforms that are based on open standards and protocols should be preferred over those that use proprietary standards and protocols. Open standards and protocols are likely to be supported by more than one cloud vendor, thus enabling users to migrate between different vendors easily. Also, in many cases open standards make it easier to port existing standalone applications to the cloud. Take a Java web application as an example. Most Java web applications are based on the J2EE suite of standards (JSP, Servlets, JDBC etc.). If the target cloud platform also supports these open standards, the user can easily migrate his J2EE app to the cloud without having to make too many changes. Similarly he can easily migrate the app from one cloud platform to another as long as both platforms support the same J2EE standards. </div>
<div style="text-align: justify;">
Speaking of openness, cloud platforms that are open source and distributed under liberal licenses should get extra credit over closed source ones. Open source cloud platforms allow the user to modify and shape the platform according to the user requirements, rather than forcing the user to change their apps according to the changes made by the cloud platform vendor. Also, with an open source cloud framework, users will be in a position to maintain and support the platform on their own, in a situation where the original vendor decides to discontinue support for the platform.</div>
<div style="text-align: justify;">
Another possible solution is to use a hybrid cloud approach instead of solely relying on a remote public cloud maintained by a third party vendor. A hybrid cloud approach typically involves a private cloud maintained by the user, and then selectively bursting into the public cloud to handle high availability and high scalability scenarios. This method does involve some additional expenses and legwork on the user's part, but the user ultimately remains in control of his data and applications, and no third party vendor can take that away from the user. Also as far as most small and medium-sized organizations are concerned, what they expect from the cloud are features like multi-tenancy, self-provisioning, optimal resource utilization and auto-scaling. Spending a few bucks on running a server rack or two to make that happen is usually not a big deal. Most companies do that today anyway. However, from a technical standpoint, we need easy-to-deploy, easy-to-maintain and reliable private cloud frameworks, which are compatible with popular public cloud platforms, to really take advantage of this hybrid cloud model. Fortunately, thanks to some excellent work by a few start-ups like <a href="http://www.eucalyptus.com/">Eucalyptus</a> and <a href="http://www.appscale.com/">AppScale</a>, this is no longer an issue. These vendors provide highly competitive private cloud and hybrid cloud solutions that are fully compatible with widely used public cloud platforms such as AWS and Google App Engine. If the user is capable of procuring the necessary hardware resources and manpower, these cloud platforms can even be used to set up fully-fledged private clouds that have all the bells and whistles of popular public clouds. That’s a great way to bask in the glory of the cloud, while maintaining full ownership and control over your enterprise IT assets.</div>
<div style="text-align: justify;">
Software frameworks like <a href="http://jclouds.incubator.apache.org/">Apache JClouds</a> provide another approach for dealing with potential risks of the cloud. These software frameworks allow user's code to interact with multiple heterogeneous cloud platforms by abstracting out the differences between various clouds. If we consider JClouds, as of now it supports close to 30 different cloud platforms including AWS, OpenStack and Rackspace. This implies that any application written using JClouds can be executed on around 30 different cloud platforms without having to make any code changes. As the influence of the cloud continues to grow, developers should seriously consider writing their code using high-level APIs like JClouds, without getting tied into a single specific cloud platform.</div>
<div style="text-align: justify;">
Cloud has certainly changed the way we all think about IT and computing. While its benefits are quite attractive, it also comes with a few potential risks. Users and developers should think carefully, plan ahead and take preventive action soon to avoid these pitfalls.</div>
</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-17799280775039620052013-06-21T15:39:00.002-07:002013-06-21T15:39:39.748-07:00White House API Standards, DX and UX<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
The White House recently published some <a href="https://github.com/WhiteHouse/api-standards">standards for developing web APIs</a>. While going through the documentation, I came across a new term - DX. DX stands for developer experience. As anybody would understand, providing a good developer experience is the key to the success of a web API. Developers love to program with clean, intuitive APIs. On the other hand clunky, non-intuitive APIs are difficult to program with and usually are full of nasty surprises that make the developer's life hard. Therefore DX is perhaps the single most important factor when it comes to differentiating a good API from a not-so-good API.</div>
<div style="text-align: justify;">
The term DX reminds me of another similar term - UX. As you would guess UX stands for user experience. A few years ago UX was one of the most exciting topics in the IT industry. For a moment there everybody was talking and writing about UX and how websites and applications should be developed with UX best practices in mind. It seems with the rise of the web APIs, cloud and mobile apps, DX is starting to generate a similar buzz. In fact I think for a wide range of application development, PaaS, web and middleware products DX would be way more important than UX. <a href="http://redmonk.com/sogrady/">Stephen O'Grady</a> was so right. <a href="http://thenewkingmakers.com/">Developers are the new kingmakers</a>. </div>
</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com1tag:blogger.com,1999:blog-4206392247746930256.post-69766622947317533822013-06-19T15:37:00.000-07:002013-06-19T15:37:08.892-07:00Is Subversion Going to Make a Come Back?<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
The <a href="http://apache.org/">Apache Software Foundation</a> (ASF) announced the release of <a href="http://subversion.apache.org/docs/release-notes/1.8.html">Subversion 1.8</a> yesterday. As I started to read the release notes, I found myself wondering how Subversion is still alive. The ASF uses Subversion heavily for pretty much everything. In fact the source code of Subversion itself is managed in a Subversion repository. But outside the ASF I've seen a strong push towards switching from Subversion to Git. Most startups and research groups that I know of have been using Git from day one. WSO2, the company I used to work for, is in the process of moving their code to Git. Being an Apache committer I obviously have to use Subversion regularly. But about a year ago I started using Git (<a href="https://github.com/">GitHub</a> to be exact) for my other development activities, and I absolutely adore it. It scales well for large code bases and large development teams, and it makes common tasks such as merging, reverting, reviewing other people's work and branching much easier and more intuitive. </div>
<div style="text-align: justify;">
But as it turns out, Subversion is still the world's most widely used version control system. As declared in the <a href="https://blogs.apache.org/foundation/entry/the_apache_software_foundation_announces43">official blog post</a> rolled out by the ASF yesterday, a number of major projects and companies, including WordPress, heavily use Subversion. According to <a href="http://www.ohloh.net/repositories/compare">Ohloh</a>, around 53% of open source projects use Subversion, compared to 29% for Git. Subversion has captured quite a share of the market, making it a very hard-to-kill technology. It will be interesting to see how the competition between Subversion and Git unfolds. The new release comes with a bunch of new features, which indicates that the project is very much alive and kicking, and that the Subversion community is not even close to giving up on it.</div>
</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-83527857022983270462013-06-14T12:03:00.000-07:002013-06-14T12:03:36.155-07:00More Reasons to Love Python - A Lesson on KISS<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
Recently I've been doing some work in the area of programming language design. At one point I wanted to define a Python subset which allows only the simplest Python statements without loops, conditionals, functions, classes and a bunch of other high-level constructs. So I looked into the grammar specification of the Python language and I was astonished by its simplicity and succinctness. Click <a href="http://docs.python.org/2/reference/grammar.html">here</a> to take a look for yourself. It's no longer than 125 lines of text, and the whole thing can be printed on one side of an A4 sheet. This is definitely one of those instances where the best design is also the simplest design. No wonder everybody loves Python.</div>
<div style="text-align: justify;">
However that's not the whole point. Having selected a suitable Python subset, I was looking into ways of implementing a simple parser for those grammar rules. I've done some work with <a href="https://javacc.java.net/">JavaCC</a> in the past, so I jumped straight into implementing a Java-based parser for the selected Python subset using JavaCC. After a few hours of coding I managed to get it working too. The next step of my project required me to do some analysis on the abstract syntax tree (AST) produced by the parser. While looking around for existing work that fit my requirements, I came across Python's native <a href="http://docs.python.org/2.7/library/ast.html">ast</a> module. I immediately realized that all those hours I had spent implementing the JavaCC-based parser were a complete waste. The ast module provides excellent support for parsing Python code and constructing ASTs. This is all you have to do to parse some Python code using the ast module and obtain an AST representation of it.</div>
<pre class="brush:python">import ast
# The variable 'source' contains the Python statement to be parsed
source = 'x = y + z'
tree = ast.parse(source)
</pre>
<div style="text-align: justify;">
The ast module supports several modes. The default mode is exec, which parses a sequence of Python statements. The module also supports a special eval mode, which parses a single one-liner Python expression. It turned out the eval mode supports more or less the exact Python subset I wanted to use. So I threw away my JavaCC-based parser and wrote the following snippet of Python code to get the job done.</div>
<pre class="brush:python">import ast
# The variable 'source' contains the Python expression to be parsed.
# Note that eval mode only accepts a single expression, so statements
# such as assignments would raise a SyntaxError here.
source = 'y + z'
tree = ast.parse(source, mode='eval')
</pre>
<div style="text-align: justify;">
Now when it came to analyzing the AST produced by the parser, the ast module again turned out to be useful. The module provides two helper classes, namely <a href="http://docs.python.org/2.7/library/ast.html#ast.NodeVisitor">NodeVisitor</a> and <a href="http://docs.python.org/2.7/library/ast.html#ast.NodeTransformer">NodeTransformer</a>, which can be used to either traverse or transform a given Python AST. To use these helper classes, we just need to extend them and implement the appropriate visit methods. There's a top-level visit method and one visit_* method per AST node type (e.g. visit_Str, visit_Num, visit_BoolOp etc.). Here's an example NodeVisitor implementation that flattens a given Python AST into a list.</div>
<pre class="brush:python">class NodeEnumerator(ast.NodeVisitor):
    def get_node_list(self, tree):
        self.nodes = []
        self.visit(tree)
        return self.nodes

    def visit(self, node):
        self.generic_visit(node)
        self.nodes.append(node)
</pre>
<div style="text-align: justify;">
These helper classes can be used to do virtually anything with a given AST. If you want, you can even implement a Python interpreter in Python using this approach. In my case I'm running some search and isomorphism detection algorithms on the Python ASTs.</div>
<div style="text-align: justify;">
So once again I've been pleasantly surprised and deeply impressed by the simplicity and richness of Python. It looks like the designers of Python have thought of everything. Kudos to Python aside, this whole experience taught me to always look for existing, simple solutions before doing things in my own complicated way. It reminds me of the good old KISS principle - "Keep It Simple, Stupid". </div>
</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com1tag:blogger.com,1999:blog-4206392247746930256.post-22884052297794039422013-04-05T17:11:00.000-07:002013-04-07T23:07:11.810-07:00MDCC - Strong Consistency with Performance <div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
A few weeks back a couple of my colleagues and I finished developing a complete implementation of the <a href="http://mdcc.cs.berkeley.edu/">MDCC</a> (Multi-Data Center Consistency) protocol. MDCC is a fast commit protocol proposed by UC Berkeley for large-scale geo-replicated databases. The main advantage of MDCC is that it supports strong consistency for data while providing transaction performance similar to eventually consistent systems. </div>
<div style="text-align: justify;">
With traditional distributed commit protocols, supporting strong consistency usually requires executing complex distributed consensus algorithms (e.g. <a href="http://research.microsoft.com/en-us/um/people/lamport/pubs/pubs.html#lamport-paxos">Paxos</a>). Such algorithms generally require multiple rounds of communication. Therefore, when deployed in a multi-data center setting where the inter-data center latency is close to 100ms, transaction performance degrades to almost unacceptable levels. For this reason most replicated database systems and cloud data stores have opted to support a weaker notion of consistency. This greatly speeds up transactions, but you always run the risk of data becoming inconsistent or even lost.</div>
<div style="text-align: justify;">
MDCC employs a special variant of Paxos called <a href="http://research.microsoft.com/apps/pubs/default.aspx?id=64624">Fast Paxos</a>. Fast Paxos takes an optimistic approach that allows it to commit most transactions within a single network roundtrip. This way a data object update can be replicated to any number of data centers within a single request-response window. The protocol is also effectively masterless, which means that an application executing in a data center in Europe does not have to contact a special master server that could potentially reside in a data center in the US. The only time the protocol doesn't finish within a single request-response window is when two or more transactions attempt to update the same data object (a transaction conflict). In that case a per-object master is elected and the Classic Paxos protocol is invoked to resolve the conflict. If the possibility of a conflict is small, MDCC commits most transactions within a single network roundtrip, greatly improving transaction throughput and latency. </div>
<div style="text-align: justify;">
Unlike most replicated database systems, MDCC doesn't require explicit sharding of data into multiple segments, although sharding can be supported on top of MDCC if needed. Also, unlike most cloud data stores, MDCC has excellent support for atomic multi-row (multi-object) transactions. That is, multiple data objects can be updated atomically within a single read-write transaction. All these properties make MDCC an excellent choice for implementing powerful database engines for modern distributed and cloud computing environments.</div>
<div style="text-align: justify;">
Our implementation of MDCC is based on Java. We use <a href="http://thrift.apache.org/">Apache Thrift</a> as the communication framework between different components. <a href="http://zookeeper.apache.org/">ZooKeeper</a> is used for leader election (we need to elect a per-object leader whenever there is a conflict), and <a href="http://hbase.apache.org/">HBase</a> is used as the storage engine. All application data and metadata are stored in HBase. In order to reduce the number of storage accesses, we also have a layer of in-memory caching. All critical information and updates are written through to the underlying HBase server to maintain strong consistency, while the cache helps avoid a large fraction of storage references. Our experiments show that most read operations complete without ever going to the HBase layer. </div>
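<div style="text-align: justify;">
The write-through caching idea described above can be illustrated with a minimal, self-contained sketch. The names Storage, InMemoryStorage and WriteThroughCache are hypothetical stand-ins used purely for illustration; the actual MDCC code is organized differently and talks to HBase.</div>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the HBase-backed storage layer.
interface Storage {
    void put(String key, byte[] value);
    byte[] get(String key);
}

class InMemoryStorage implements Storage {
    final Map<String, byte[]> data = new HashMap<String, byte[]>();
    public void put(String key, byte[] value) { data.put(key, value); }
    public byte[] get(String key) { return data.get(key); }
}

// Write-through cache: every update is propagated to the backend
// immediately, so the backend always holds the latest consistent state,
// while reads can often be served from memory.
class WriteThroughCache implements Storage {
    private final Map<String, byte[]> cache = new HashMap<String, byte[]>();
    private final Storage backend;

    WriteThroughCache(Storage backend) { this.backend = backend; }

    public void put(String key, byte[] value) {
        cache.put(key, value);    // keep the cache fresh
        backend.put(key, value);  // write through to durable storage
    }

    public byte[] get(String key) {
        byte[] v = cache.get(key); // fast path: no storage reference
        return (v != null) ? v : backend.get(key);
    }
}
```

<div style="text-align: justify;">
The key design point is that writes are never cached without also reaching the backend, which is what keeps strong consistency intact while still sparing the storage layer most read traffic.</div>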
<div style="text-align: justify;">
We provide a simple and intuitive API in our MDCC implementation so that users can write their own applications using our MDCC engine. A simple transaction implemented using this API looks like this.</div>
<pre class="brush:java">
TransactionFactory factory = new TransactionFactory();
Transaction txn = factory.create();
try {
    txn.begin();
    byte[] foo = txn.read("foo");
    txn.write("bar", "bar".getBytes());
    txn.commit();
} catch (TransactionException e) {
    reportError(e);
} finally {
    factory.close();
}
</pre>
<div style="text-align: justify;">
We also did some basic performance tests on our MDCC implementation using the <a href="http://research.yahoo.com/Web_Information_Management/YCSB">YCSB</a> benchmark. We used 5 EC2 micro instances distributed across 3 data centers (regions) and deployed a simple 2-shard MDCC cluster. Each shard consisted of 5 MDCC storage nodes (amounting to a total of 10 MDCC storage nodes). We ran several different types of workloads on this cluster and in general succeeded in achieving < 1ms latency for read operations and < 100ms latency for write operations. Our implementation performs best with mostly-read workloads, but even with a fairly large number of conflicts, the system delivers reasonable performance. </div>
<div style="text-align: justify;">
Our system ensures correct and consistent transaction semantics. We have excellent support for atomic multi-row transactions and concurrent transactions, and even some rudimentary support for crash recovery. If you are interested in giving this implementation a try, grab the source code from <a href="https://github.com/hiranya911/mdcc">https://github.com/hiranya911/mdcc</a>. Use Maven 3 to build the distribution, then extract and run.</div>
</div>Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-89031155599421016852013-03-11T18:58:00.001-07:002013-03-11T18:59:52.931-07:00Starting HBase Server Programmatically<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
I'm implementing a database application these days, and for that I wanted to programmatically start and stop a standalone <a href="http://hbase.apache.org/">HBase</a> server. More specifically, I wanted to make the HBase server a part of my application, so that whenever my application starts, the HBase server also starts up. This turned out to be more difficult than I expected. To start an HBase server you actually need to start three things:</div>
<div style="text-align: justify;">
1. HBase master server</div>
<div style="text-align: justify;">
2. HBase region server</div>
<div style="text-align: justify;">
3. ZooKeeper</div>
<div style="text-align: justify;">
The default startup script shipped with the HBase binary distribution does all of this for you. But I wanted a more tightly integrated and fully programmatic solution. Unfortunately the HBase public API doesn't seem to expose the functionality required for programmatically starting and stopping the above components (at least not in a straightforward manner). So after going through the HBase source and trying out various things, I managed to come up with some code that does exactly what I want. At a high level, this is what my code does:</div>
<div style="text-align: justify;">
1. Create an instance of <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/zookeeper/HQuorumPeer.html">HQuorumPeer</a> and execute it on a separate thread.</div>
<div style="text-align: justify;">
2. Create and initialize an <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HBaseConfiguration">HBaseConfiguration</a> instance.</div>
<div style="text-align: justify;">
3. Create an instance of <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/master/HMaster.html">HMaster</a> and execute it on a separate thread.</div>
<div style="text-align: justify;">
4. Create an instance of <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/regionserver/HRegionServer.html">HRegionServer</a> and execute it on a separate thread.</div>
<div style="text-align: justify;">
Both HMaster and HRegionServer implement the Runnable interface, so it's easy to run them on separate threads. I created a simple Java Executor instance and scheduled HMaster and HRegionServer for execution on it. HQuorumPeer, however, was a bit tricky. This class only contains a main method and exposes no real public API. One solution is to create your own thread class which simply invokes that main method. The other option is to write your own HQuorumPeer class implementing the Runnable interface. The original HQuorumPeer class from the HBase project is fairly small, so I took the second approach: I simply copied the code from the original HQuorumPeer and created my own version implementing the Runnable interface. Overall, this is what my finalized code looks like:</div>
<pre class="brush:java">
exec.submit(new HQuorumPeer(properties));
log.info("HBase ZooKeeper server started");

Configuration config = HBaseConfiguration.create();
File hbaseDir = new File(hbasePath, "data");
config.set(HConstants.HBASE_DIR, hbaseDir.getAbsolutePath());
for (String key : properties.stringPropertyNames()) {
    if (key.startsWith("hbase.")) {
        config.set(key, properties.getProperty(key));
    } else {
        String name = HConstants.ZK_CFG_PROPERTY_PREFIX + key;
        config.set(name, properties.getProperty(key));
    }
}

try {
    master = new HMaster(config);
    regionServer = new HRegionServer(config);
    masterFuture = exec.submit(master);
    regionServerFuture = exec.submit(regionServer);
    log.info("HBase server is up and running...");
} catch (Exception e) {
    handleException("Error while initializing HBase server", e);
}
</pre>
<div style="text-align: justify;">
Then I nicely wrapped up all this logic into a single reusable util class called <a href="https://github.com/hiranya911/mdcc/blob/master/core/src/main/java/edu/ucsb/cs/mdcc/util/HBaseServer.java">HBaseServer</a>. So whenever I want to start/stop HBase in my application, this is all I have to do.</div>
<pre class="brush:java">HBaseServer hbaseServer = new HBaseServer();
hbaseServer.start();
</pre>
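<div style="text-align: justify;">
As an aside, the first option mentioned earlier - a plain thread that simply invokes HQuorumPeer's main method - could be sketched as follows. FakeQuorumPeer is a stand-in used here so the example is self-contained; in real code the Runnable would call org.apache.hadoop.hbase.zookeeper.HQuorumPeer.main instead.</div>

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Stand-in for HQuorumPeer, whose main() normally blocks while serving
// ZooKeeper requests. Here it just signals that it has started.
class FakeQuorumPeer {
    static final CountDownLatch started = new CountDownLatch(1);
    public static void main(String[] args) {
        started.countDown();
    }
}

public class QuorumPeerLauncher {
    public static void main(String[] args) throws Exception {
        // Run the blocking main() on its own thread so the application
        // can continue starting the HBase master and region server.
        Thread t = new Thread(new Runnable() {
            public void run() {
                FakeQuorumPeer.main(new String[0]);
            }
        }, "hbase-zookeeper");
        t.setDaemon(true);
        t.start();
        boolean up = FakeQuorumPeer.started.await(5, TimeUnit.SECONDS);
        System.out.println("quorum peer started: " + up);
    }
}
```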
<div style="text-align: justify;">
Hope somebody finds this useful :)</div>
</div>Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-90808435860837582832013-02-05T11:42:00.002-08:002013-02-05T11:42:25.906-08:00How the World's Fastest ESB was Made<div dir="ltr" style="text-align: left;" trbidi="on">
<span style="text-align: justify;">A couple of years ago, at </span><a href="http://wso2.com/" style="text-align: justify;">WSO2</a><span style="text-align: justify;"> we implemented a new HTTP transport for </span><a href="http://wso2.com/products/enterprise-service-bus/" style="text-align: justify;">WSO2 ESB</a><span style="text-align: justify;">. Requirements for this new transport can be summarized as follows:</span><br />
<ol style="text-align: left;">
<li style="text-align: justify;">Ultra-fast, low latency mediation of HTTP requests.</li>
<li style="text-align: justify;">Supporting a very large number of inbound (client-ESB) and outbound (ESB-server) connections concurrently (we were looking at several thousand concurrent connections).</li>
<li style="text-align: justify;">Automatic throttling and graceful performance degradation in the presence of slow or faulty clients and servers.</li>
</ol>
<div style="text-align: justify;">
The default non-blocking HTTP (NHTTP) transport from <a href="http://synapse.apache.org/">Apache Synapse</a>, which we were also using in WSO2 ESB, supported the above requirements to a certain extent, but we wanted to do better. The default transport was very generic: it was designed to offer reasonable performance in all the integration scenarios the ESB could potentially participate in. However, HTTP load balancing, HTTP URL routing (URL rewriting) and HTTP header-based routing are some of the most widely used integration patterns in the industry, and to support these use cases well we needed a specialized transport. </div>
<div style="text-align: justify;">
The old NHTTP transport was based on a dual buffer model. Incoming message content was placed in a <a href="http://hc.apache.org/httpcomponents-core-ga/httpcore-nio/apidocs/org/apache/http/nio/util/SharedInputBuffer.html">SharedInputBuffer</a> and the outgoing message content was placed in a <a href="http://hc.apache.org/httpcomponents-core-ga/httpcore-nio/apidocs/org/apache/http/nio/util/SharedOutputBuffer.html">SharedOutputBuffer</a>. <a href="http://ws.apache.org/axiom/">Apache Axiom</a>, <a href="http://axis.apache.org/axis2/java/core/">Apache Axis2</a> and the Synapse mediation engine sit between the two buffers, reading from the input buffer and writing to the output buffer. This architecture is illustrated in the following diagram.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgtenb1CGggEaHFpLelyFgqXD0HNBoAzbw_W5S43KVsSBKkZDHUVf7WgHvYUAWhdNi9Wu9W2P4DFD6n4qiViBNk8Ce6Wt3GAkwig2xYwjZcWbu739ys5xArqiYlcNnfPD752ZgQMcPtuGXk/s1600/nhttp.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="92" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgtenb1CGggEaHFpLelyFgqXD0HNBoAzbw_W5S43KVsSBKkZDHUVf7WgHvYUAWhdNi9Wu9W2P4DFD6n4qiViBNk8Ce6Wt3GAkwig2xYwjZcWbu739ys5xArqiYlcNnfPD752ZgQMcPtuGXk/s640/nhttp.png" width="640" /></a></div>
<div style="text-align: justify;">
The key advantage of this architecture is that it enables the ESB (mediators) to intercept all messages and manipulate them in any way necessary. The main downside is that every message goes through the Axiom layer, which is not really necessary in cases like HTTP load balancing and HTTP header-based routing. Also, the overhead of moving data from one buffer to another was not always justifiable in this model. So when we started working on the new HTTP transport, we wanted to get rid of these limitations. We knew that this might result in a not-so-generic HTTP transport, but we were willing to pay that price at the time.</div>
<div style="text-align: justify;">
So after some very interesting brainstorming sessions, an exciting 1-week long hackathon followed by several months of testing, bug-fixing and refactoring we came up with what’s today known as the HTTP pass-through transport. This transport was based on a single buffer model and completely bypassed the Axiom layer. The resulting architecture is illustrated below.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4_3I0VkMR7h_s9JZ1Cw_A7z17_TdIQRfkfGHqnWchWBJt1GR2t1D_LCC5GL_st5rkNldzzBTlXxNgpWqWC6SPTC44BV4KtRgxjil7VM6VqMs-ZugcTQ5gIHew6XquDccZESSva6n3wGlt/s1600/pass.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="112" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4_3I0VkMR7h_s9JZ1Cw_A7z17_TdIQRfkfGHqnWchWBJt1GR2t1D_LCC5GL_st5rkNldzzBTlXxNgpWqWC6SPTC44BV4KtRgxjil7VM6VqMs-ZugcTQ5gIHew6XquDccZESSva6n3wGlt/s320/pass.png" width="320" /></a></div>
<div style="text-align: justify;">
The HTTP pass-through transport was first released in June 2011 along with WSO2 ESB 4.0. Back then it was disabled by default, and the user had to enable it by uncommenting a few entries in the axis2.xml file. The performance numbers we were seeing with the new transport were simply remarkable. WSO2 published some of these benchmarking results in a <a href="http://wso2.org/library/articles/2012/03/wso2-esb-message-transfer-mechanisms-comparative-benchmarks">March 2012 article</a>. However, at this point the two main limitations of the new transport were starting to give us headaches.</div>
<ol style="text-align: left;">
<li style="text-align: justify;">Configuration overhead (Users had to explicitly enable the transport depending on their target use cases)</li>
<li style="text-align: justify;">Cannot support any integration scenario that requires HTTP content manipulation (because Axiom was bypassed, any mediator attempting to access the message payload would not get anything useful to work with)</li>
</ol>
<div style="text-align: justify;">
In addition to these technical issues there were other process-related issues we had to deal with. For instance, maintaining two separate HTTP transports was twice as much work for developers and testers. We found that because the pass-through transport was not the default, it often lagged behind the NHTTP transport in terms of features and stability. So after a few brainstorming sessions we decided to try to make the pass-through transport the default HTTP transport in Apache Synapse/WSO2 ESB. But this required making the content manipulation use cases (content-aware use cases) work with the new transport, which implied bringing Axiom back into the picture - the very thing we wanted to avoid in our initial implementation. So in order to balance our performance and heterogeneous integration requirements, we came up with the idea of "on-demand message parsing in the mediation engine".</div>
<div style="text-align: justify;">
In this new model, each mediator instance belongs to one of two classes.</div>
<ol style="text-align: left;">
<li style="text-align: justify;">Content-unaware mediators – Mediators that never access the message content in any way (eg: drop mediator)</li>
<li style="text-align: justify;">Content-aware mediators – Mediators that always access the message content (eg: xslt mediator)</li>
</ol>
<div style="text-align: justify;">
We also identified a third class known as conditionally content-aware mediators. These mediators could be either content-aware or content-unaware depending on their exact instance configuration. For example, a simple log mediator instance configured as <log/> is content-unaware. However, a log mediator configured as <log level=”full”/> is content-aware, since it's expected to log the message payload. Similarly, a simple property mediator instance such as <property name=”foo” value=”bar”/> is content-unaware, but <property name=”foo” expression=”/some/xpath”/> could be content-aware depending on what the XPath expression does. In order to capture this content-awareness characteristic of mediator instances at runtime, we introduced a new method (isContentAware) to the top-level Mediator interface of Synapse. The default implementation in the AbstractMediator class returns true, so as to maintain backward compatibility. </div>
<div style="text-align: justify;">
With this change in place, we modified the mediation engine to check the content-awareness property of each mediator at runtime before submitting a message to it. List mediators such as the SequenceMediator run the check recursively on their child mediators to obtain the final value. Assuming that messages are always received through the pass-through HTTP transport, the mediation engine invokes a special message parsing routine whenever a mediator is detected to be content-aware. It is in this routine that we bring Axiom into the picture. Therefore, if none of the mediators in a given flow or service is content-aware, the pass-through transport works as usual without ever engaging Axiom. But whenever a content-aware mediator is involved, we bring Axiom in. This way we reap the performance benefits of the pass-through transport while supporting all integration scenarios of the ESB. Since we engage Axiom on demand, we get the best possible outcome for every scenario. For instance, a simple pass-through proxy always works without any Axiom interactions. An XSLT proxy that transforms requests engages Axiom only in the request flow; the response flow operates without parsing the messages.</div>
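<div style="text-align: justify;">
The recursive content-awareness check can be sketched with a few stand-in classes. Mediator, DropMediator, XsltMediator and SequenceMediator here are simplified stand-ins following the concepts in the text, not the actual Synapse classes.</div>

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for Synapse's Mediator interface.
interface Mediator {
    boolean isContentAware();
}

// A drop mediator never touches the message payload.
class DropMediator implements Mediator {
    public boolean isContentAware() { return false; }
}

// An XSLT mediator always needs the parsed payload.
class XsltMediator implements Mediator {
    public boolean isContentAware() { return true; }
}

// A sequence is content-aware if any of its children is.
class SequenceMediator implements Mediator {
    private final List<Mediator> children;
    SequenceMediator(Mediator... children) {
        this.children = Arrays.asList(children);
    }
    public boolean isContentAware() {
        for (Mediator child : children) {
            if (child.isContentAware()) {
                return true;
            }
        }
        return false;
    }
}

public class MediationSketch {
    public static void main(String[] args) {
        Mediator passThrough = new SequenceMediator(new DropMediator());
        Mediator transforming =
                new SequenceMediator(new DropMediator(), new XsltMediator());
        // The engine would stay on the fast path for the first flow and
        // invoke the Axiom-based message builder for the second.
        System.out.println(passThrough.isContentAware());   // false
        System.out.println(transforming.isContentAware());  // true
    }
}
```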
<div style="text-align: justify;">
Another tricky problem was message parsing itself. For instance, how do we parse a message and then send it out when there is only one buffer provided by the underlying pass-through transport? Ideally we need two buffers: one to read the incoming message from and one to write the outgoing message to. The fact that the Axis2 message builder framework can only handle streams posed a few problems too. The buffer we maintained in the pass-through transport was a Java NIO <a href="http://docs.oracle.com/javase/1.5.0/docs/api/java/nio/ByteBuffer.html">ByteBuffer</a> instance, so we needed to adapt the buffer into a stream implementation whenever the mediation engine engages Axiom. We solved the first problem by having our message builder routine create a second output buffer whenever Axiom is dragged into the picture. Outgoing messages are serialized into this second buffer, and the pass-through transport was modified to pick the outgoing content from the second buffer when it's available. Writing an InputStream implementation that can wrap a ByteBuffer instance solved the second problem.</div>
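<div style="text-align: justify;">
A minimal InputStream wrapper over a ByteBuffer, along the lines of the adapter just mentioned, could look like this. This is a sketch only; the actual implementation in the transport handles more details such as mark/reset and buffer lifecycle.</div>

```java
import java.io.InputStream;
import java.nio.ByteBuffer;

// Adapts a read-ready ByteBuffer so that stream-oriented consumers
// (such as the Axis2 message builders) can read the buffered content.
public class ByteBufferInputStream extends InputStream {
    private final ByteBuffer buffer;

    public ByteBufferInputStream(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    @Override
    public int read() {
        // Next byte as an unsigned value, or -1 at end of buffer
        return buffer.hasRemaining() ? (buffer.get() & 0xff) : -1;
    }

    @Override
    public int read(byte[] dst, int off, int len) {
        if (!buffer.hasRemaining()) {
            return -1;
        }
        int n = Math.min(len, buffer.remaining());
        buffer.get(dst, off, n);
        return n;
    }
}
```

<div style="text-align: justify;">
Wrapping ByteBuffer.wrap(payloadBytes) in such a stream lets existing stream-based builder code consume the buffered message without copying it into another structure first.</div>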
<div style="text-align: justify;">
One last problem that needed to be solved was handling security. In Synapse/WSO2 ESB, security is handled by <a href="http://axis.apache.org/axis2/java/rampart/">Apache Rampart</a>, which runs as an Axis2 module that intercepts the messages before they hit the mediation engine. So on-demand parsing at the mediation engine doesn’t work in this scenario. We need to parse the messages before Rampart intercepts them. We solved this issue by introducing a new smart handler to the Axis2 handler chain, which intercepts every message and performs an early parse if security is engaged on the flow. The same solution can be extended to support other modules that require parsing message payload in the Axis2 handler chain.</div>
<div style="text-align: justify;">
The reason I decided to write this post is that WSO2 just released WSO2 ESB 4.6, and this release is based on the new model I've described here. The pass-through transport is now what users get by default. The WSO2 team has also published some <a href="http://wso2.org/library/articles/2013/01/esb-performance-65">performance figures</a> that clearly indicate what the new design is capable of. It turns out the <b>latest release of WSO2 ESB outperforms all the major open source ESB vendors by a significant margin</b>. This release also comes with a new XSLT mediator (Fast XSLT) that operates on top of the pass-through model of the underlying transport, and a new <a href="http://wso2.org/library/articles/2013/01/streaming-xpath-parser-wso2-esb">streaming XPath implementation</a> based on Antlr.</div>
<div style="text-align: justify;">
The next step of this effort would be to get these improvements integrated into the Apache Synapse code base. This work is already underway and you can monitor its progress through <a href="https://issues.apache.org/jira/browse/SYNAPSE-913">SYNAPSE-913</a> and <a href="https://issues.apache.org/jira/browse/SYNAPSE-920">SYNAPSE-920</a>.</div>
</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com10tag:blogger.com,1999:blog-4206392247746930256.post-8584579614782255322013-01-28T15:01:00.001-08:002013-01-28T15:04:43.442-08:00Introducing AppsCake: Makes Deploying AppScale a Piece of Cake<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
One of my very first contributions to <a href="http://appscale.cs.ucsb.edu/" style="text-align: justify;">AppScale</a> was a component named <a href="https://github.com/AppScale/appscake" style="text-align: justify;">AppsCake</a>. AppsCake is a dynamic web component that provides a web frontend for the command-line AppScale Tools. It enables users to deploy and start AppScale over several different types of infrastructure. This greatly reduces the overhead of starting and managing a PaaS, as most of the heavy lifting can be performed with the click of a button. Users do not need to learn the AppScale Tools commands, nor do they have to be familiar with any command-line interface. With AppsCake, a regular web browser is all you need to initialize AppScale and start deploying applications in the cloud.</div>
<div style="text-align: justify;">
As of now AppsCake supports deploying AppScale over virtualized clusters (eg: <a href="http://xen.org/">Xen</a>), <a href="http://aws.amazon.com/ec2/">Amazon EC2 </a>and <a href="http://www.eucalyptus.com/">Eucalyptus</a>. Users can select the environment in which AppScale should be deployed and provide the required credentials and other metadata for the target environment through the web interface. AppsCake takes care of invoking the proper command sequences with the appropriate arguments to initialize AppScale. The web frontend also allows the users to view deployment logs and monitor the deployment progress in near real-time. </div>
<div style="text-align: justify;">
This component can be further extended and be offered as a service of its own if needed. That way, users can access AppsCake through a well-known URL and setup an AppScale deployment remotely for the purpose of executing a specific task or an application. As an example consider a group of scientists who want to run various scientific computations in the cloud (say as MPI or MapReduce jobs). The group can use a private Eucalyptus cluster or a shared EC2 account as their computing infrastructure. The group can be provided with a single well-known AppsCake instance as the entry point for AppScale. Then whenever a member of the team wants to run a computation on the target shared environment, he or she can use the AppsCake service to initiate his or her own AppScale instance and run the required computation in the cloud. This scheme maximizes resource sharing while providing sufficient isolation between applications/jobs initiated by individual users.</div>
<div style="text-align: justify;">
AppsCake is implemented using Ruby and <a href="http://www.sinatrarb.com/">Sinatra</a>. To try this out, simply check out the source from <a href="https://github.com/AppScale/appscake">GitHub</a> and execute the bin/debian_setup.sh script (the build script only supports Debian/Ubuntu environments as of now). Then execute bin/appscake to start the AppsCake web service. Now you can point your browser to https://localhost:28443 and start interacting with the service.</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7li_amEvbFHMEq8Uv5xFFeM1CYlF1Qkrjg2x-Z6ZOTQ_rZwIndzb2C-5V2MZdLQogC6ziAX1oZ4eIfKcBo6rZXCVX0TW8Ziif1QK5MzlADtuZLLIpD0iMwGitW_2eRsHwyRoVl2ekokvO/s1600/ac1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7li_amEvbFHMEq8Uv5xFFeM1CYlF1Qkrjg2x-Z6ZOTQ_rZwIndzb2C-5V2MZdLQogC6ziAX1oZ4eIfKcBo6rZXCVX0TW8Ziif1QK5MzlADtuZLLIpD0iMwGitW_2eRsHwyRoVl2ekokvO/s320/ac1.png" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHvHiinHHsTvWJQpCulplqq3dySikdX3zjfRCNmh9RuYsi36q6GB-IjyfgIToxcdEYXv9IRh0B_i7Ksbb5_6grSRJ-52eeYfa3nq65HXcl7Rrbyj66RIVzeHyM4Pkctuw5wESJuR3TpR94/s1600/ac2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgHvHiinHHsTvWJQpCulplqq3dySikdX3zjfRCNmh9RuYsi36q6GB-IjyfgIToxcdEYXv9IRh0B_i7Ksbb5_6grSRJ-52eeYfa3nq65HXcl7Rrbyj66RIVzeHyM4Pkctuw5wESJuR3TpR94/s320/ac2.png" width="320" /></a></div>
<div style="text-align: justify;">
<a href="http://www.cs.ucsb.edu/~cgb/">Chris</a> has posted a neat little screencast that explains how to use AppsCake to deploy AppScale on Virtual Box. Don’t forget to check that out too.</div>
<div style="text-align: justify;">
<iframe allowfullscreen="" frameborder="0" height="315" src="http://www.youtube.com/embed/aeDgNG2Fn8I" width="560"></iframe>
</div>
</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-54968301113745349822013-01-20T13:36:00.000-08:002013-01-20T13:47:15.758-08:00On Premise API Management for Services in the Cloud<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
In some of my recent posts I explained how to install and start <a href="http://appscale.cs.ucsb.edu/" style="text-align: justify;">AppScale</a>. I showed how to use AppScale command-line tools to manage an AppScale PaaS on virtualized environments such as <a href="http://www.xen.org/" style="text-align: justify;">Xen</a> and IaaS environments such as <a href="http://aws.amazon.com/ec2/" style="text-align: justify;">EC2</a> and <a href="http://www.eucalyptus.com/" style="text-align: justify;">Eucalyptus</a>. Then we also looked at how to deploy <a href="https://developers.google.com/appengine/" style="text-align: justify;">Google App Engine</a> (GAE) apps over AppScale. In this post we are going to try something different.</div>
<div style="text-align: justify;">
Here I’m going to describe a possible hybrid architecture for deploying RESTful services in the cloud and exposing those services through an on-premise API management platform. This type of architecture is most suitable for B2B integration scenarios where one organization provides a range of services and several other organizations consume them with their own custom use cases and SLAs. Both service providers and service consumers can greatly benefit from the proposed hybrid architecture. It enables the API providers to reap the benefits of the cloud with reduced deployment cost, reduced long-term maintenance overhead and reduced time-to-market. API consumers can use their own on-premise API management platform as a local proxy, which provides powerful access control, rate control, analytics and community features on top of the services already deployed in the cloud. </div>
<div style="text-align: justify;">
To try this out, first spin up an AppScale PaaS in a desired cloud environment. You can refer to my previous posts or go through the AppScale wiki to learn how to do this. Then we can deploy a simple RESTful web service in our AppScale cloud. Here I’m posting the source code for a simple web service called “starbucks” written in Python using the GAE APIs. The “starbucks” service can be used to submit and manage simple drink orders. It uses the GAE <a href="https://developers.google.com/appengine/docs/python/datastore/">datastore API</a> to store all the application data and exposes all the fundamental CRUD operations as REST calls (Create = POST, Update = PUT, Read = GET, Delete = DELETE).</div>
<pre class="brush:python">try:
    import json
except ImportError:
    import simplejson as json

import random
import uuid

from google.appengine.ext import db, webapp
from google.appengine.ext.webapp.util import run_wsgi_app

PRICE_CHART = {}

class Order(db.Model):
    order_id = db.StringProperty(required=True)
    drink = db.StringProperty(required=True)
    additions = db.StringListProperty()
    cost = db.FloatProperty()

def get_price(order):
    if PRICE_CHART.has_key(order.drink):
        price = PRICE_CHART[order.drink]
    else:
        price = random.randint(2, 6) - 0.01
        PRICE_CHART[order.drink] = price
    if order.additions is not None:
        price += 0.50 * len(order.additions)
    return price

def send_json_response(response, payload, status=200):
    response.headers['Content-Type'] = 'application/json'
    response.set_status(status)
    if isinstance(payload, Order):
        payload = {
            'id' : payload.order_id,
            'drink' : payload.drink,
            'cost' : payload.cost,
            'additions' : payload.additions
        }
    response.out.write(json.dumps(payload))

class OrderSubmissionHandler(webapp.RequestHandler):
    def post(self):
        order_info = json.loads(self.request.body)
        order_id = str(uuid.uuid1())
        drink = order_info['drink']
        order = Order(order_id=order_id, drink=drink, key_name=order_id)
        if order_info.has_key('additions'):
            additions = order_info['additions']
            if isinstance(additions, list):
                order.additions = additions
            else:
                order.additions = [ additions ]
        else:
            order.additions = []
        order.cost = get_price(order)
        order.put()
        self.response.headers['Location'] = self.request.url + '/' + order_id
        send_json_response(self.response, order, 201)

class OrderManagementHandler(webapp.RequestHandler):
    def get(self, order_id):
        order = Order.get_by_key_name(order_id)
        if order is not None:
            send_json_response(self.response, order)
        else:
            self.send_order_not_found(order_id)

    def put(self, order_id):
        order = Order.get_by_key_name(order_id)
        if order is not None:
            order_info = json.loads(self.request.body)
            drink = order_info['drink']
            order.drink = drink
            if order_info.has_key('additions'):
                additions = order_info['additions']
                if isinstance(additions, list):
                    order.additions = additions
                else:
                    order.additions = [ additions ]
            else:
                order.additions = []
            order.cost = get_price(order)
            order.put()
            send_json_response(self.response, order)
        else:
            self.send_order_not_found(order_id)

    def delete(self, order_id):
        order = Order.get_by_key_name(order_id)
        if order is not None:
            order.delete()
            send_json_response(self.response, order)
        else:
            self.send_order_not_found(order_id)

    def send_order_not_found(self, order_id):
        info = {
            'error' : 'Not Found',
            'message' : 'No order exists by the ID: %s' % order_id,
        }
        send_json_response(self.response, info, 404)

app = webapp.WSGIApplication([
    ('/order', OrderSubmissionHandler),
    ('/order/(.*)', OrderManagementHandler)
], debug=True)

if __name__ == '__main__':
    run_wsgi_app(app)
</pre>
<div style="text-align: justify;">
Before we go any further, let’s take a few seconds and appreciate how simple and concise this piece of code is. With just about 100 lines of Python code we have developed a comprehensive webapp that uses JSON as the data exchange format, does database access and provides decent error handling. Imagine doing the same thing in a language like Java in a traditional servlet container environment. We would have to write a lot more code and also bundle a ridiculous number of additional dependencies to parse and construct JSON and perform database queries. But as seen here, GAE APIs make it absolutely trivial to develop powerful web APIs for the cloud with a minimal amount of code.</div>
<div style="text-align: justify;">
You can download the complete “starbucks” application from <a href="http://people.apache.org/~hiranya/starbucks.tar.gz">here</a>. Simply extract the downloaded tarball and you’re good to go. The webapp consists of just two files: main.py contains all the source code of the app, and app.yaml is the GAE webapp descriptor. No additional libraries or files are needed to make this work. Use AppScale-Tools to deploy the app in your AppScale cloud.</div>
<pre>appscale-upload-app --file /path/to/starbucks --keyname my_key_name</pre>
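<div style="text-align: justify;">
For reference, the app.yaml descriptor mentioned above is tiny. A minimal descriptor for a Python GAE app of this shape looks roughly like the following sketch; the application name and handler pattern here are illustrative assumptions, not copied from the actual archive.</div>

```yaml
application: starbucks
version: 1
runtime: python
api_version: 1

handlers:
# Route every /order request (illustrative pattern) to the main script.
- url: /order.*
  script: main.py
```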
<div style="text-align: justify;">
To try out the app, put the following JSON string into a file named order.json:</div>
<pre>{
  "drink" : "Caramel Frappuccino",
"additions" : [ "Whip Cream" ]
}</pre>
<div style="text-align: justify;">
Now execute the following Curl request on your App:</div>
<pre>curl -v -d @order.json -H "Content-Type: application/json" http://host:port/order</pre>
<div style="text-align: justify;">
Replace 'host' and 'port' with the appropriate values for your AppScale PaaS. This request should return an HTTP 201 Created response with a Location header.</div>
<div style="text-align: justify;">
And now for the API management part. For this I’m going to use the open source API management solution from WSO2, a project that I was a part of a while ago. Download the latest <a href="http://wso2.com/products/api-manager/">WSO2 API Manager</a> and install it on your local computer by extracting the zip archive. Go into the bin directory and execute wso2server.sh (or wso2server.bat for Windows) to start the API Manager. You need to have JDK 1.6 or higher installed to be able to do this.</div>
<div style="text-align: justify;">
Once the server is up and running, navigate to http://localhost:9763/publisher and sign in to the console using “admin” as both the username and the password. Go ahead and create an API for our “starbucks” service in the cloud. You can use http://host:port as the service URL, where 'host' and 'port' should point to the AppScale PaaS. The API creation process should be pretty straightforward. If you need any help, you can refer to my past blog posts on WSO2 API Manager or go through the <a href="http://docs.wso2.org/wiki/display/AM130/WSO2+API+Manager+Documentation">WSO2 documentation</a>. Once the API is created and published, head over to the API Store at http://localhost:9763/store.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiv-jMUBbYlwTi-Lh3SeiNMdLPHPpOOCJZSlJIXBrXpOUX_9lj2GhrMLFtyXENqXTOTBfJc1Jwxvuq9OEm-U-16pb_Pwl3fSmPvW7HMZD5mcC3nsY5LU4uYHh55Km4rAP1uttRge5I_gd_F/s1600/store.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiv-jMUBbYlwTi-Lh3SeiNMdLPHPpOOCJZSlJIXBrXpOUX_9lj2GhrMLFtyXENqXTOTBfJc1Jwxvuq9OEm-U-16pb_Pwl3fSmPvW7HMZD5mcC3nsY5LU4uYHh55Km4rAP1uttRge5I_gd_F/s320/store.png" width="320" /></a></div>
Now you can sign up at the API Store as an API consumer, generate an API key for the Starbucks API and start using it.</div>
<div style="text-align: justify;">
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiwqCr74VgIjduPvlmCN99SWUZdxUlHdCUGOHTQQzQVxDDhqaC-9DA681ThYbV2F1WQ54WzpMhiSBW1ffRLP3dmAXozVfwYE03lvoV30cJHqfVTBe9KxospPjByMRXJjUMAGe9f2MW2OL1e/s1600/subscriptions.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiwqCr74VgIjduPvlmCN99SWUZdxUlHdCUGOHTQQzQVxDDhqaC-9DA681ThYbV2F1WQ54WzpMhiSBW1ffRLP3dmAXozVfwYE03lvoV30cJHqfVTBe9KxospPjByMRXJjUMAGe9f2MW2OL1e/s320/subscriptions.png" width="320" /></a></div>
Submit Order:</div>
<div style="text-align: justify;">
<pre>curl -v -d @order.json -H "Content-Type: application/json" -H "Authorization: Bearer api_key" http://localhost:8280/starbucks/1.0.0/order</pre>
</div>
<div style="text-align: justify;">
Review Order:</div>
<pre>curl -v -H "Authorization: Bearer api_key" http://localhost:8280/starbucks/1.0.0/order/order_id</pre>
<div style="text-align: justify;">
Delete Order:</div>
<div style="text-align: justify;">
<pre>curl -v -X DELETE -H "Authorization: Bearer api_key" http://localhost:8280/starbucks/1.0.0/order/order_id</pre>
</div>
<div style="text-align: justify;">
Replace 'api_key' with the API key generated by the API Store. Replace 'order_id' with the unique identifier sent in the response to the submit order request.</div>
<div style="text-align: justify;">
There you have it. On-premise API management for services in the cloud. This looks pretty simple at first glance, but it is actually quite a powerful architecture. Note that all the critical components (service runtime, registry and consumer) are very well separated from each other, which allows maximum flexibility. The portions in the cloud can benefit from cloud-specific features such as autoscaling to deliver maximum throughput with optimal resource utilization. Since the API management platform is controlled by individual consumer organizations, each can easily enforce its own custom policies and SLAs and optimize for its common access patterns.</div>
</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-824169604305404292013-01-09T18:20:00.001-08:002013-01-09T18:20:17.934-08:00How to Get Your Third Party APIs to Shutup?<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
When programming with 3rd party libraries, sometimes we need to suppress or redirect the standard output they generate. A very common scenario is that a third-party library used in an application produces very verbose output that clutters up the output of our program. In most programming languages we can write a simple suppress/redirect procedure to fix this problem. Such functions are sometimes colloquially known as STFU functions. Here I'm describing a couple of STFU functions I implemented in some of my recent work.</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<u>1. <a href="https://github.com/AppScale/appscake">AppsCake</a> (Web interface for AppScale-Tools)</u></div>
<div style="text-align: justify;">
This is a Ruby based dynamic web component which uses some of the core AppScale-Tools libraries. For this project I wanted to capture the standard output of the AppScale-Tools libraries and display it on a web page. As the first step I wanted to redirect the standard output of AppScale-Tools to a separate text file. Here's what I did.</div>
<pre class="brush: ruby">def redirect_standard_io(timestamp)
  begin
    orig_stderr = $stderr.clone
    orig_stdout = $stdout.clone
    log_path = File.join(File.expand_path(File.dirname(__FILE__)), "..", "logs")
    $stderr.reopen File.new(File.join(log_path, "deploy-#{timestamp}.log"), "w")
    $stderr.sync = true
    $stdout.reopen File.new(File.join(log_path, "deploy-#{timestamp}.log"), "w")
    $stdout.sync = true
    retval = yield
  rescue Exception => e
    puts "[__ERROR__] Runtime error in deployment process: #{e.message}"
    $stdout.reopen orig_stdout
    $stderr.reopen orig_stderr
    raise e
  ensure
    $stdout.reopen orig_stdout
    $stderr.reopen orig_stderr
  end
  retval
end</pre>
<div style="text-align: justify;">
Now whenever I want to redirect the standard output and invoke the AppScale-Tools API I can do this.</div>
<pre class="brush: ruby">redirect_standard_io(timestamp) do
  # Call AppScale-Tools API
end
</pre>
<div style="text-align: justify;">
<u>2. <a href="https://github.com/AppScale/hawkeye.git">Hawkeye</a> (API fidelity test suite for AppScale)</u></div>
<div style="text-align: justify;">
This is a Python based framework which makes a lot of RESTful invocations using the standard Python <a href="http://docs.python.org/2/library/httplib.html">httplib</a> API. I wanted to trace the HTTP requests and responses that are being exchanged during the execution of the framework and log them to a separate log file. Python httplib has a verbose mode, which can be enabled via a debug flag on the HTTPConnection class, and it turns out this mode logs almost all the information I need. But unfortunately it logs all this information to the standard output of the program, thus messing up the output I wanted to present to users. Therefore I needed a way to redirect the standard output for all httplib API calls. Here's how that problem was solved.</div>
<pre class="brush: python">http_log = open('logs/http.log', 'a')
original = sys.stdout
sys.stdout = http_log
try:
    pass  # Invoke httplib here
finally:
    sys.stdout = original
    http_log.close()</pre>
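<div style="text-align: justify;">
The same idea can be packaged more safely as a context manager, so the original stdout is restored even if the httplib call throws. The sketch below is my own illustration, not part of the actual Hawkeye code; the log file name and the example host are assumptions. It also shows httplib's standard set_debuglevel() call, which is the usual way to switch on the verbose mode discussed above.</div>

```python
import sys
from contextlib import contextmanager

try:
    import httplib                  # Python 2 name used by Hawkeye
except ImportError:
    import http.client as httplib   # same module under its Python 3 name

@contextmanager
def redirected_stdout(path):
    """Temporarily send sys.stdout to a log file, restoring it on exit."""
    log = open(path, 'a')
    original = sys.stdout
    sys.stdout = log
    try:
        yield log
    finally:
        sys.stdout = original
        log.close()

# Example: httplib's verbose trace goes to the file instead of the console.
# 'example.com' and 'http.log' are placeholder values.
conn = httplib.HTTPConnection('example.com')
conn.set_debuglevel(1)   # httplib prints request/response details to stdout
with redirected_stdout('http.log'):
    print('request trace would appear here')
```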
</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-29346090677197939492013-01-08T19:28:00.001-08:002013-01-08T19:29:00.652-08:00Evolution of Networked Computing<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
</div>
<ul>
<li>1969: <i>As of now, computer networks are still in their infancy, but as they grow up and become sophisticated, we will probably see the spread of 'computer utilities', which like present electric and telephone utilities, will service individual homes and offices across the country.</i> - <a href="http://www.lk.cs.ucla.edu/index.html">Leonard Kleinrock,</a> UCLA</li>
<li>1984: <i>The network is the computer.</i> - <a href="http://en.wikipedia.org/wiki/John_Gage">John Gage</a>, Sun Microsystems</li>
<li>2008: <i>The data center is the computer.</i> - <a href="http://www.cs.berkeley.edu/~pattrsn/">David Patterson</a>, UC Berkeley</li>
<li>2008: <i>Cloud is the computer.</i> - <a href="http://www.buyya.com/">Rajkumar Buyya</a>, Melbourne University</li>
</ul>
<div style="text-align: justify;">
From the book "<a href="http://www.amazon.com/Distributed-Cloud-Computing-Parallel-Processing/dp/0123858801">Distributed and Cloud Computing</a>" by Kai Hwang et al.</div>
</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-91192459698592731392013-01-03T11:30:00.000-08:002013-01-03T11:30:11.916-08:00The Era of Webapps is Over - Say Hello to Web APIs<div dir="ltr" style="text-align: left;" trbidi="on">
<br />
<div style="text-align: justify;">
I remember a time when developing a web application (webapp) was considered one of the coolest and most exciting feats a software developer could perform. It was not so long ago that a software product with no web application component was considered uncool, unpresentable and unmarketable. A wide range of dynamic web programming technologies and standards were born in this era, and they continued to thrive thanks to the unprecedented growth and evolution of the Internet. However, today, if we take a closer look at how the Internet is being used in our day-to-day lives and in the business world, one thing becomes hauntingly obvious. The era of webapps has passed. Now is the era of web APIs.</div>
<div style="text-align: justify;">
First of all, let's take a closer look at webapps. Webapps are designed to be directly consumed by human users. The user is aware of the URL through which the application can be accessed, which he/she enters into a web browser. The web browser then interacts with the remote URL by making HTTP GET requests to pull content and HTTP POST requests to submit content. HTML is used as the primary data exchange format, and how the content appears on the user’s web browser (style and formatting) is actually a huge deal. All the important information that should be communicated to the user must be embedded in the HTML payload, as that’s the only thing that’s going to get rendered on the screen by the web browser. Things like HTTP status codes and headers do not play a big role in webapps, except perhaps when reporting an error (404 Not Found, 500 Internal Server Error etc.) or performing a redirect (302 Found + Location header). </div>
<div style="text-align: justify;">
But webapps are increasingly becoming less interesting to both developers and users. Today it’s all about web APIs. Interestingly most of the technologies that are used to develop webapps can also be used to create and expose web APIs. In fact it’s not wrong to say that web APIs are the next generation webapps and webapp development frameworks mutated into web API development frameworks as a consequence of the natural evolution of the web.</div>
<div style="text-align: justify;">
So how do web APIs differ from webapps? And what makes them cooler? Unlike webapps, web APIs are not designed to be directly consumed by human users. They are APIs, meaning they are designed to be consumed by other applications. Developers can use web APIs to construct other high-level APIs and end-user applications. The end user may or may not know the exact location or URL with which the application interacts. Web APIs may use any content exchange format, but JSON and XML are the most popular choices. A properly designed web API would use most of the available HTTP methods (at the very least GET, POST, PUT, DELETE and OPTIONS) combined with the proper use of HTTP status codes and headers to pass critical control information. Things like layout and formatting don’t mean anything in the web API world, but effective use of URL patterns, intuitiveness and simplicity of the APIs mean everything.</div>
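<div style="text-align: justify;">
To make the method and status-code conventions concrete, here is a small sketch (my own illustration, not from the post) of the dispatch logic a well-behaved web API follows for a single resource; the in-memory store and the handler signature are assumptions made for brevity.</div>

```python
# Minimal sketch of RESTful method/status-code semantics for one resource.
ORDERS = {}
NEXT_ID = [0]  # mutable counter for generated resource IDs

def handle(method, order_id=None, body=None):
    """Dispatch a request the way a well-designed web API would,
    returning an (HTTP status code, payload) pair."""
    if method == 'POST':
        NEXT_ID[0] += 1
        new_id = str(NEXT_ID[0])
        ORDERS[new_id] = body
        return 201, {'id': new_id}           # Created (plus a Location header)
    if order_id not in ORDERS:
        return 404, {'error': 'Not Found'}   # unknown resource
    if method == 'GET':
        return 200, ORDERS[order_id]
    if method == 'PUT':
        ORDERS[order_id] = body              # full replacement of the resource
        return 200, ORDERS[order_id]
    if method == 'DELETE':
        return 200, ORDERS.pop(order_id)
    return 405, {'error': 'Method Not Allowed'}
```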
<div style="text-align: justify;">
The end-user applications that are developed using web APIs can include desktop applications and, more interestingly, mobile applications. In my opinion this is the last nail in the coffin of the webapps. Simply put, people are no longer browsing the web. Rather, they use mobile apps. Latest reports and Internet usage surveys show that the amount of mobile Internet traffic is starting to surpass the amount of desktop Internet traffic (see the references section). This means webapps are increasingly becoming obsolete. To point out a real-world example, let's take <a href="http://github.com/">GitHub</a>. GitHub is a pretty cool webapp, one of the best in my personal opinion. We can consume this app in a desktop environment using a traditional web browser such as Firefox. We can do the same with a smartphone, using the mini web browser the device is equipped with (e.g. Safari for iPhones). But that’s not good enough for us. We need a native GitHub app for our smartphones. And as a result we now have the <a href="http://mobile.github.com/">GitHub app</a> for iPhone and Android platforms. Soon these mobile apps will become the dominant traffic sources of GitHub. This has already happened to many of the popular social networking service providers such as <a href="https://twitter.com/download">Twitter</a> and <a href="https://www.facebook.com/mobile/">Facebook</a>. The web API is a more critical component in these systems compared to their webapp counterparts. </div>
<div style="text-align: justify;">
This transition from webapps to web APIs is a crucial one for technology driven companies. They have to carefully assess this trend and adjust their game plans accordingly. Many organizations have already realized the shift towards the web APIs and started to expose their business functionality as web APIs rather than as webapps. Web APIs open up businesses towards a larger and more diverse clientele with ample future proofing and more room for change. This massive push towards APIs has also given birth to the field of API management, which has now become a business of its own and starting to produce many lucrative business opportunities around the world. The growing number of on-premise and cloud API management solutions (Layer7, Apigee, Mashery etc) available is a testament to this fact.</div>
<div style="text-align: justify;">
Perhaps the most impacted by this change are the developers. They need to take a whole different stance in the way they think about software solutions that run over the web. They should learn to implement systems for other developers rather than for end users. They should learn to optimize systems for machine-to-machine interaction rather than for human-computer interaction. They need to pay a lot of attention to little things like using proper HTTP status codes, using proper HTTP headers, and adhering to open standards. Problems they are used to wrestling with, such as improving the browser compatibility of webapps and session management, are becoming less important by the day. They will have to learn to pay less attention to things like form authentication and adopt more HTTP-friendly security mechanisms such as <a href="http://tools.ietf.org/html/rfc2617">BasicAuth</a> and <a href="http://oauth.net/2/">OAuth</a>. API management is going to become a mainstream technology, and developers will have to start treating it as a primary development tool, just like version control and issue tracking. Technology platform providers will also have to take this shift seriously. Developers no longer need webapp development frameworks. They need web API development frameworks. This is why platforms like <a href="https://developers.google.com/appengine/">Google App Engine</a> have become so successful. Their staying power resides in their ability to facilitate the development of powerful web APIs with a minimal amount of code.</div>
<div style="text-align: justify;">
So does all this mean traditional webapps are dead (as in dead for good)? Well, not really. The need for webapps will always be there, at least for the foreseeable future. But we will see more dominant deployment and adoption of web APIs compared to webapps. Webapps will become the COBOL of the 21st century: thousands of lines of code are written every year, but nobody gives a damn. Newly implemented webapps will sit on a layer of web APIs, making the webapp just one of many frontends of a larger system. </div>
<div style="text-align: justify;">
All in all, if you’re a developer, some very exciting times are ahead of you. So buckle up!</div>
<div style="text-align: justify;">
<br /></div>
<div style="text-align: justify;">
<u>References</u></div>
<div style="text-align: justify;">
<a href="http://news.cnet.com/8301-1023_3-57556943-93/mobile-internet-traffic-gaining-fast-on-desktop-internet-traffic/">http://news.cnet.com/8301-1023_3-57556943-93/mobile-internet-traffic-gaining-fast-on-desktop-internet-traffic/</a></div>
<div style="text-align: justify;">
<a href="http://bits.blogs.nytimes.com/2011/10/12/mobile-accounts-for-7-percent-of-web-traffic-report-says/">http://bits.blogs.nytimes.com/2011/10/12/mobile-accounts-for-7-percent-of-web-traffic-report-says/</a></div>
<div style="text-align: justify;">
<a href="http://www.kpcb.com/file/kpcb-internet-trends-2012">http://www.kpcb.com/file/kpcb-internet-trends-2012</a></div>
</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-89667531151282078282012-12-31T16:11:00.001-08:002012-12-31T16:12:23.550-08:00Creating and Uploading Eucalyptus Images<div dir="ltr" style="text-align: left;" trbidi="on">
By far the best reference I've found on the subject: <a href="http://virtually-a-machine.blogspot.com/2009/09/eucalyptus-5-creating-and-running-vms.html">http://virtually-a-machine.blogspot.com/2009/09/eucalyptus-5-creating-and-running-vms.html</a><br />
<div>
<br />
Simple, elegant and most importantly no BS! I've tried these steps out several times in Euca2 and Euca3 and they work like a charm. Nice work Igor.</div>
</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0tag:blogger.com,1999:blog-4206392247746930256.post-51985748609227491722012-12-29T17:01:00.004-08:002012-12-29T17:03:11.519-08:00Deploying Applications in the Cloud Using AppScale<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: justify;">
In my last two posts I briefly explained how to set up <a href="http://appscale.cs.ucsb.edu/">AppScale</a> and get it up and running. Once you have a running AppScale PaaS, you can start deploying webapps in the cloud. You develop webapps for AppScale using the <a href="https://developers.google.com/appengine/">Google App Engine</a> (GAE) SDK. AppScale is fully API-compatible with GAE, and therefore any GAE application can be deployed on AppScale with no code changes. If you do not have any GAE applications to try on AppScale, you may follow one of the <a href="https://developers.google.com/academy/apis/cloud/appengine/">official GAE tutorials</a> and develop a sample GAE application using Python, Java or Go. Alternatively you can check out the AppScale <a href="https://github.com/AppScale/sample-apps">sample-apps</a> repository and try to deploy one of the pre-packaged sample apps. To check out the sample-apps repository, execute the following command in a shell:</div>
<pre>git clone https://github.com/AppScale/sample-apps.git</pre>
<div style="text-align: justify;">
This will check out a directory named sample-apps to your local disk. This directory contains 3 subdirectories named python, java and go. As the names suggest, each subdirectory contains a number of sample GAE applications written in the corresponding language. One of the simplest sample applications available for you to try out is the "guestbook" application. The Python version of the application can be found in the sample-apps/python/guestbook directory and the Java version of it can be found in the sample-apps/java/guestbook directory. This application provides a simple web front-end for users to enter a comment and browse comments entered by other users. It uses the GAE datastore API under the hood to store and retrieve comments entered by users. You can use <a href="http://code.google.com/p/appscale/wiki/AppScale_Tools_Usage">AppScale-Tools</a> to upload the guestbook application to your AppScale cloud.</div>
<pre>appscale-upload-app --file samples-apps/python/guestbook --keyname appscale_test</pre>
<div style="text-align: justify;">
The keyname flag should indicate the keyname you provided when starting the AppScale instance using the <a href="http://code.google.com/p/appscale/wiki/AppScale_Tools_Usage#appscale-run-instances">appscale-run-instances</a> command. Once you execute the above command you will be prompted to enter the admin e-mail address for your application. Here you can enter the admin e-mail address you used when starting AppScale. Application deployment could take a few minutes. If everything goes smoothly, the tools will print the URL through which your webapp can be accessed.</div>
<pre>Your app can be reached at the following URL: http://ec2-174-129-188-141.compute-1.amazonaws.com/apps/guestbook</pre>
<div style="text-align: left;">
Try uploading a few applications using the above command and see how it goes. You can also try developing your own custom apps and deploying them in the cloud. </div>
<div style="text-align: left;">
AppScale-Tools also allow you to start an AppScale cloud with an application preloaded. To invoke this feature, you simply need to pass the --file option to the appscale-run-instances command.</div>
<pre style="background-color: white; color: #09111a; line-height: 16px; text-align: justify;">appscale-run-instances --min 10 --max 10 --infrastructure euca --machine emi-12345678 --keyname appscale_test --group appscale_test --file samples-apps/python/guestbook</pre>
<div style="text-align: justify;">
To undeploy an application you can use the <a href="http://code.google.com/p/appscale/wiki/AppScale_Tools_Usage#appscale-remove-app">appscale-remove-app</a> command.</div>
<pre>appscale-remove-app --appname guestbook --keyname appscale_test</pre>
<div style="text-align: justify;">
Finally you can terminate and tear down an AppScale PaaS using the <a href="http://code.google.com/p/appscale/wiki/AppScale_Tools_Usage#appscale-terminate-instances">appscale-terminate-instances</a> command.</div>
<pre>appscale-terminate-instances --keyname appscale_test</pre>
<div style="text-align: justify;">
If your AppScale PaaS was running over an IaaS layer such as EC2, the above command will also take care of terminating the VMs in the cloud.</div>
<div style="text-align: justify;">
In my next few posts, I'll explain a little bit about GAE APIs and how to implement cool apps for AppScale using those APIs.</div>
</div>
Hiranya Jayathilakahttp://www.blogger.com/profile/17230790150335483296noreply@blogger.com0