tag:blogger.com,1999:blog-54313370369586565742024-02-20T21:48:07.036+01:00Distributed Computation & other computer science stuff ...\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b
\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd
\x80\xe8\xdc\xff\xff\xff/bin/shAlberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.comBlogger25125tag:blogger.com,1999:blog-5431337036958656574.post-82390608576336043832014-12-28T17:46:00.004+01:002014-12-28T17:55:26.759+01:00Posting App Video to iTunes Connect (iOS8+)<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: left;">
<div style="text-align: justify;">
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmpULR4_TwmecmF8uizWWgRC9El7K8y2xZbbmy62yYFeHB6uyi4Um1emUVKyyr-UA7KQaHP8y3HcfFL1SadUjuf2bT9LUW6Brhov65u37YGzn86x2twWHoC_glNHXq-0IO8uHAPReuQ3s/s1600/Screen+Shot+2014-12-28+at+17.44.36.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmpULR4_TwmecmF8uizWWgRC9El7K8y2xZbbmy62yYFeHB6uyi4Um1emUVKyyr-UA7KQaHP8y3HcfFL1SadUjuf2bT9LUW6Brhov65u37YGzn86x2twWHoC_glNHXq-0IO8uHAPReuQ3s/s1600/Screen+Shot+2014-12-28+at+17.44.36.png" height="117" width="400" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<span style="background-color: black;"><span style="color: white; font-family: Trebuchet MS, sans-serif;"><span style="font-size: x-small;">I really cannot produce</span><span style="font-size: x-small;"> the 1334x750 resolution required for the iPhone 6. It seems to be a non-standard H.264 resolution: I have tried iMovie, VLC and QuickTime, with no luck.</span></span></span></div>
</div>
<div style="text-align: left;">
<div style="text-align: justify;">
<span style="background-color: black; color: white; font-family: Trebuchet MS, sans-serif; font-size: x-small;">It is a pain: every time I drag my video in, I get the same error message: <em>The app video preview dimensions should be: 1334x750, 750x1334</em>.</span></div>
</div>
<div style="text-align: left;">
<div style="text-align: justify;">
<span style="background-color: black; color: white; font-family: 'Trebuchet MS', sans-serif; font-size: x-small;">Then I found </span><a href="http://www.reddit.com/r/iOSProgramming/comments/2ihiv3/my_adventure_of_posting_app_trailers_to_itunes/" style="background-color: black; font-family: 'Trebuchet MS', sans-serif;">this post</a><span style="background-color: black; color: white; font-family: 'Trebuchet MS', sans-serif; font-size: x-small;">. But it does not share the code details, so I took some time to debug the messy iTunes Connect JavaScript. </span></div>
</div>
<div style="text-align: left;">
<div style="text-align: justify;">
<span style="font-size: x-small;"><span style="background-color: black; color: white; font-family: Trebuchet MS, sans-serif;">It took about 20 minutes, so to save you that time, here is the trick: </span></span></div>
</div>
<div style="text-align: left;">
<div style="text-align: justify;">
<span style="font-size: x-small;"><span style="background-color: black; color: white; font-family: Trebuchet MS, sans-serif;">Look for </span><span style="background-color: black; color: white; font-family: Courier New, Courier, monospace;">dorp_directive.js</span><span style="background-color: black; color: white; font-family: Trebuchet MS, sans-serif;"> and set a breakpoint at the line shown in red: </span></span><br />
<span style="font-size: x-small;"><span style="background-color: black; color: white; font-family: Trebuchet MS, sans-serif;"><br /></span></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> var loadFunc = function() {</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> //console.log("dummy video loaded");</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> var width = this.videoWidth;</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> var height = this.videoHeight;</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> var dimensionsArr = new Array();</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> var expectedW, expectedH, expectedDimensionsArr;</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> for (var i = 0; i < validSizesForDevice.length; i++) {</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> expectedDimensionsArr = validSizesForDevice[i].split("x");</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <span style="color: red;"> expectedW = parseInt(expectedDimensionsArr[0]);</span></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> expectedH = parseInt(expectedDimensionsArr[1]);</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> if (expectedW === width && expectedH === height) {</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> scope.continueWithUpload(file, url, 'videoDropped');</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> return;</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> }</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> }</span><br />
<span style="color: white; font-family: Trebuchet MS, sans-serif; font-size: x-small; text-align: left;"><br /></span>
<span style="background-color: black;"><span style="color: white; font-family: Trebuchet MS, sans-serif; font-size: x-small; text-align: left;">Change the values parsed from the local array</span><span style="color: white; text-align: left;"> </span><span style="color: white; font-family: 'Courier New', Courier, monospace; font-size: xx-small; text-align: left;">expectedDimensionsArr </span><span style="color: white; font-family: Trebuchet MS, sans-serif; font-size: x-small; text-align: left;">so you can successfully pass the check and reach the </span><span style="color: white; font-family: 'Courier New', Courier, monospace; font-size: xx-small; text-align: left;">scope.continueWithUpload</span><span style="color: white; font-family: 'Helvetica Neue', Arial, Helvetica, sans-serif; font-size: x-small; text-align: left;"> </span><span style="color: white; font-family: Trebuchet MS, sans-serif; font-size: x-small; text-align: left;">routine call.</span></span></div>
</div>
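For reference, here is a minimal reconstruction of what that validation does, pieced together from the snippet above (the array contents and the 10-argument to <code>parseInt</code> are my own additions for the sketch; only the loop structure comes from the real script). It shows why overwriting the parsed values at the breakpoint lets the upload continue: the drop handler only calls <code>scope.continueWithUpload</code> when the video dimensions match one of the expected strings.

```javascript
// Sketch of the dimension check in dorp_directive.js (names taken from
// the snippet above; the size list is an assumption for iPhone 6 videos).
var validSizesForDevice = ["1334x750", "750x1334"];

function dimensionsAccepted(width, height) {
  for (var i = 0; i < validSizesForDevice.length; i++) {
    var expectedDimensionsArr = validSizesForDevice[i].split("x");
    // Overwriting expectedW/expectedH in the debugger forces this match.
    var expectedW = parseInt(expectedDimensionsArr[0], 10);
    var expectedH = parseInt(expectedDimensionsArr[1], 10);
    if (expectedW === width && expectedH === height) {
      return true; // the real script calls scope.continueWithUpload here
    }
  }
  return false;
}
```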
<span style="background-color: black; color: white; font-family: Trebuchet MS, sans-serif; font-size: x-small;">That's it. </span><br />
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<span style="font-family: Helvetica Neue, Arial, Helvetica, sans-serif; font-size: x-small;">Hope this is helpful.</span></div>
<span style="font-family: Helvetica Neue, Arial, Helvetica, sans-serif; font-size: x-small;"><br /></span></div>
tag:blogger.com,1999:blog-5431337036958656574.post-80657078749908571462011-10-09T09:04:00.004+02:002014-12-28T17:34:59.192+01:00Great losses<div dir="ltr" style="text-align: left;" trbidi="on">
<span style="font-family: trebuchet ms;">Steve Jobs 1955 - 2011</span><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://upload.wikimedia.org/wikipedia/commons/thumb/b/b9/Steve_Jobs_Headshot_2010-CROP.jpg/1920px-Steve_Jobs_Headshot_2010-CROP.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://upload.wikimedia.org/wikipedia/commons/thumb/b/b9/Steve_Jobs_Headshot_2010-CROP.jpg/1920px-Steve_Jobs_Headshot_2010-CROP.jpg" height="196" width="200" /></a></div>
<br />
<span style="font-size: 85%;">Together with Steve we have recently lost two other great mentors:</span><span style="font-size: 85%;"><span style="font-family: trebuchet ms;"><br /></span><b>Dennis MacAlistair Ritchie</b> (September 9, 1941 – found dead October 12, 2011):</span><br /><span style="font-size: 85%;"><span style="font-family: trebuchet ms;"><a href="http://upload.wikimedia.org/wikipedia/commons/thumb/0/01/Dennis_MacAlistair_Ritchie_.jpg/225px-Dennis_MacAlistair_Ritchie_.jpg" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img alt="" border="0" src="http://upload.wikimedia.org/wikipedia/commons/thumb/0/01/Dennis_MacAlistair_Ritchie_.jpg/225px-Dennis_MacAlistair_Ritchie_.jpg" style="cursor: hand; cursor: pointer; display: block; height: 260px; margin: 0px auto 10px; text-align: center; width: 225px;" /></a></span></span><br />
<br />
<span style="font-size: 85%;"><b>John McCarthy</b> (September 4, 1927 – October 24, 2011)</span><br /><a href="http://upload.wikimedia.org/wikipedia/commons/thumb/4/49/John_McCarthy_Stanford.jpg/200px-John_McCarthy_Stanford.jpg" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"><img alt="" border="0" src="http://upload.wikimedia.org/wikipedia/commons/thumb/4/49/John_McCarthy_Stanford.jpg/200px-John_McCarthy_Stanford.jpg" style="cursor: hand; cursor: pointer; display: block; height: 133px; margin: 0px auto 10px; text-align: center; width: 200px;" /></a><br /></div>
Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com0tag:blogger.com,1999:blog-5431337036958656574.post-83440870660845704732011-09-05T07:45:00.003+02:002011-09-05T07:51:40.376+02:00IBM SyNAPSE<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://blogs-images.forbes.com/alexknapp/files/2011/08/cognitive-chip.jpg"><img style="display: block; margin: 0px auto 10px; text-align: center; cursor: pointer; width: 194px; height: 192px;" src="http://blogs-images.forbes.com/alexknapp/files/2011/08/cognitive-chip.jpg" alt="" border="0" /></a>
<br /><span style="font-size:85%;"><span style="font-family:trebuchet ms;">From </span><a style="font-family: trebuchet ms;" href="http://blogs.forbes.com/alexknapp/">Alex Knapp</a><span class="desc" style="font-family:trebuchet ms;">'s post on Forbes:
<br />
<br />
<br />"</span><span style="font-family:trebuchet ms;"> </span> <span style="font-family:trebuchet ms;">Dr. Modha says, “Our achievements are humble but our aspirations are lofty.” So far, the chip is fairly limited in what it can do. For example, if you show it (through programming) what a triangle looks like, then show it just a part of a triangle, its algorithms are able to display a full triangle back. That is, it recognizes a whole triangle from just a part of a triangle. You can also play Pong with it. “And sometimes it will actually win,” laughs Dr. Modha.</span> <span style="font-family:trebuchet ms;">...</span> <strong style="font-family: trebuchet ms;">The Problem With Modern Computers</strong><span style="font-family:trebuchet ms;"> </span></span><p style="font-family: trebuchet ms;"><span style="font-size:85%;">For the past half-century, most computers run on what’s known as von Neumann architecture, and the computer I’m writing this on and the computer you’re reading this on definitely run on von Neumann architecture. In a von Neumann system, the processing of information and the storage of information are kept separate. Data travels to and from the processor and memory – but the computer can’t process and store at the same time. By the nature of the architecture, it’s a linear process. That’s why software is written as a set of instructions for a computer to follow – it’s a linear sequence of events, built for a linear process. This is where clock speed comes in – the faster the clock speed (for example, the 4 cores on my processor run at 3.0GHz), the faster the computer can process those linear instructions.</span></p><span style="font-size:85%;"><span style="font-family:trebuchet ms;"> </span></span><p style="font-family: trebuchet ms;"><span style="font-size:85%;">According to Dr. 
Modha, von Neumann architecture was essential to develop computers in the days of vacuum tubes and early transistors, but modern chipbuilding techniques have exposed its limitations. “We’ve gotten to the point where we can pack more transistors on a chip than we can actually power, because if we powered them all, they’d burn out” due to the excess heat created by the electricity in the chip. Dr. Modha likens processing to transporting oranges. The trees are memory, the oranges are bits. The consumers are processors. The oranges have to travel, by highway, to get to the consumer – but the more oranges, the more tied up traffic gets, so you run into problems on the chip.</span></p><span style="font-size:85%;"><span style="font-family:trebuchet ms;"> Solving this problem is the focus of computer scientists around the world. The SyNAPSE team’s solution is to bypass von Neumann architecture entirely with cognitive computing. To keep the orange grove analogy, the SyNAPSE team wants to, in Dr. Modha’s words, “move people to the orange grove” so the processors can be integrated with the memory.</span> </span><p style="font-family: trebuchet ms;"><span style="font-size:85%;">Dr. Modha’s team isn’t the first group trying to practically bypass von Neumann architecture. But what makes their approach unique is that they’re taking their inspiration from the way human neural architecture works. It’s not emulation, <em>per se</em> – as Dr. Modha was quick to point out, “Comparing what we can do to what mother nature can do is quite humbling.”</span></p><span style="font-size:85%;"><span style="font-family:trebuchet ms;"> </span></span><p style="font-family: trebuchet ms;"><span style="font-size:85%;">That said, I think their solution to computing is quite elegant. Here’s the basics of how the chip works. What they’ve been able to achieve right now is a chip with 256 processors (which the team has dubbed “neurons”) laid out in an array of rows and columns. 
The neurons process in parallel, rather than relying on linear structures, and are connected to 1024 axons on the chips by synapses – which is where the memory is stored. The axons act to either excite or hinder the power going through the synapses to the processors. Depending on the power and information it’s getting from the axons and synapses, the neuron determines whether it’s reached its predetermined “threshold potential” – basically, whether it’s found a solution to the problem or part of the problem put to it. If it has, it will “spike” – sending a signal back through the synapse – and reset itself.</span></p><span style="font-size:85%;"><span style="font-family:trebuchet ms;"> </span></span><p style="font-family: trebuchet ms;"><span style="font-size:85%;">The synapse then has the solution sent from the neuron, while the neuron goes into a state where it’s awaiting further information. Now picture all 256 neurons acting at the same time, with their signals modulated by the actions of the synapses and axons, and you can see the potential. All 256 neurons are working in parallel to each other, rather than simply acting on a set of linear instructions. The goal, says Dr. Modha, is a chip that’s able to better handle environmental feedback. “So far, our computers are left-brained, focusing on linearity and computation. What this is is a right-brained computer” capable of recognizing patterns and able to handle ambiguity. And because it’s operating in parallel, with an integrated memory, it uses much less power than a traditional processor. When one neuron spikes, the total active energy used is only 45 picojoules – that’s 4.5 x 10^-11 Joules, a <em>very</em> small amount of energy.
<br />
<br />"
<br />
<br />IBM web site: <a href="http://www.ibm.com/smarterplanet/us/en/business_analytics/article/cognitive_computing.html">http://www.ibm.com/smarterplanet/us/en/business_analytics/article/cognitive_computing.html</a>
<br />
<br /></span></p><span style="font-size:85%;"> </span>
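The spike-and-reset behaviour described in the quote can be sketched in a few lines of code. This is only an illustrative toy under my own assumptions (the threshold value and names are mine, not taken from the SyNAPSE chip): a neuron accumulates incoming signal until it crosses its threshold potential, then reports a spike and resets.

```javascript
// Toy "neuron" illustrating the threshold-potential idea from the quote.
// Hypothetical sketch only; the real chip's behaviour is far richer.
function makeNeuron(threshold) {
  var potential = 0; // accumulated input, the "membrane potential"
  return {
    receive: function (input) {
      potential += input;
      if (potential >= threshold) {
        potential = 0;  // reset after firing
        return true;    // spike: a signal sent back through the synapse
      }
      return false;     // below threshold: keep waiting for input
    }
  };
}
```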
<br />Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com0tag:blogger.com,1999:blog-5431337036958656574.post-12946518847436905032011-05-07T14:03:00.006+02:002011-05-07T14:12:17.564+02:00Integrated CMOS Tri-Gate Transistors<div style="text-align: left;"><span style=";font-family:trebuchet ms;font-size:85%;" >The semiconductor industry continues to push technological innovation to keep pace with Moore’s Law, shrinking transistors so that ever more can be packed on a chip. However, at future technology nodes, the ability to shrink transistors becomes more and more problematic, in part due to worsening short channel effects and an increase in parasitic leakages with scaling of the gate-length dimension. Both transistor off-state leakage (which increases with reducing gate length dimension) and gate oxide leakage (which increases with decreasing gate dielectric thickness) are contributing to the increase in power dissipation with scaling.</span><br /></div><p style="text-align: left;font-family:trebuchet ms;"><span style="font-size:85%;">To address the transistor off-state leakage issue, in 2002 Intel developed the world’s first CMOS tri-gate transistor ... 
</span> <span style="font-size:85%;">continue here: </span><span style="font-size:85%;"><a href="http://www.intel.com/technology/silicon/integrated_cmos.htm">http://www.intel.com/technology/silicon/integrated_cmos.htm</a></span></p><p style="text-align: justify;font-family:trebuchet ms;"><span style="font-size:85%;"><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.cdrinfo.com/images/uploaded/Intel_32nm_Planar_22nmTriGate.jpg"><img style="display: block; margin: 0px auto 10px; text-align: center; cursor: pointer; width: 360px; height: 190px;" src="http://www.cdrinfo.com/images/uploaded/Intel_32nm_Planar_22nmTriGate.jpg" alt="" border="0" /></a></span><br /></p><div style="text-align: justify;"><span style="font-size:85%;"><a style="font-family: trebuchet ms;" href="http://www.intel.com/technology/silicon/integrated_cmos.htm"><br /></a></span></div>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com0tag:blogger.com,1999:blog-5431337036958656574.post-82553723005615641442010-11-02T07:05:00.011+01:002010-11-02T07:44:04.543+01:00Convey HC-1 x86 + FPGA<div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;">There are 14 FPGAs on the Convey coprocessor. 
Four FPGAs serve as the application engines that execute the personality’s instruction set; two FPGAs comprise the application engine hub that handles </span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">communication to and</span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;"> from the host x86 processor; and eight FPGA-based memory controllers provide the very fast memory.</span></span><br /></div><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHyJ7mRNvyOIJd-1bV5qZxb_CHcjq-AG-YgM8Eq1I5GKnLmCtySylk1E0YRrsNXGDhF9K4l4w_CXv7xLraZzb6yubHttzHnD3oYqzXCIX-FlmRO7GHEH8yaHYbGUHvCmdzDZofifUYACI/s1600/HC1.jpg"><img style="display: block; margin: 0px auto 10px; text-align: center; cursor: pointer; width: 400px; height: 230px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHyJ7mRNvyOIJd-1bV5qZxb_CHcjq-AG-YgM8Eq1I5GKnLmCtySylk1E0YRrsNXGDhF9K4l4w_CXv7xLraZzb6yubHttzHnD3oYqzXCIX-FlmRO7GHEH8yaHYbGUHvCmdzDZofifUYACI/s400/HC1.jpg" alt="" id="BLOGGER_PHOTO_ID_5534831977510677010" border="0" /></a><br /></span></span><div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;">“Convey systems are 2U rack-mountable systems and run industry-standard Linux. They cluster like other off the shelf servers, are managed like other servers, and connect into your HPC fabric like other servers. The point is, to the outside world, there’s nothing exotic about the HC-1. 
It’s just blindingly faster on bioinformatics algorithms than any off-the-shelf server”</span></span>.<br /><span style="font-size:85%;"><span style="font-family:trebuchet ms;">...</span></span><br /><span style="font-size:85%;"><span style="font-family:trebuchet ms;">"Convey’s hybrid-core computers</span><span style="font-family:trebuchet ms;"> have achieved impressive performance on a variety of bioinformatics applications, allowing researchers to tackle problems previously deemed impractical. For example: </span></span></div><div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;">• Sequencing. The Convey implementation of the</span> <span style="font-family:trebuchet ms;">Smith-Waterman algorithm (used for aligning DNA and protein sequences) is 172x faster than the best software implementation on conventional servers and represents the fastest Smith-Waterman implementation on a single system to date.</span></span></div><div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;">• Proteomics. University of California, San Diego (UCSD) researchers achieved a roughly 100-fold faster performance of their sophisticated MS/MS database search tool program — InsPecT — that is able to accurately identify post-translational modifications (PTM).</span> </span></div><div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;">• Computational Phylogeny Inference. The University of South Carolina developed and accelerated MrBayes, a phylogenetics application able to accurately infer “evolutionary trees,” a problem that was previously considered impractical on most computer systems. Performance is significantly faster even than other FPGA implementations.</span></span></div><div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;"></span><span style="font-family:trebuchet ms;">• Genomics. 
The Virginia Bioinformatics Institute (VBI) is using Convey hybrid-core systems for its microsatellite analysis work for the 1000 Genomes Project, an international effort to sequence the genomes of approximately 2,500 people from about 20 populations around the world."</span> </span><br /><br /></div><object height="390" width="640"><param name="movie" value="http://www.youtube.com/v/cbh7n5DrmxQ&hl=en_US&feature=player_embedded&version=3"><param name="allowFullScreen" value="true"><param name="allowScriptAccess" value="always"><embed src="http://www.youtube.com/v/cbh7n5DrmxQ&hl=en_US&feature=player_embedded&version=3" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" height="390" width="640"></embed></object>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com0tag:blogger.com,1999:blog-5431337036958656574.post-1459214619402953152010-08-24T18:01:00.004+02:002010-11-02T07:44:38.807+01:00Analog computing: Bayesian probability based NAND gates<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://static.arstechnica.com/assets/2010/08/lyric_lec_cpu_ars-thumb-640xauto-15879.jpg"><img style="display: block; margin: 0px auto 10px; text-align: center; cursor: pointer; width: 531px; height: 298px;" src="http://static.arstechnica.com/assets/2010/08/lyric_lec_cpu_ars-thumb-640xauto-15879.jpg" alt="" border="0" /></a><br /><p style="font-family:trebuchet ms;"><span style="font-size:85%;"><br /></span></p><p style="text-align: justify;font-family: 'trebuchet ms'; "><span style="font-size:85%;">"A DARPA-funded processor start-up has made bold claims about a new kind of processor that computes using probabilities, rather than the traditional ones and zeroes of conventional processors. 
Lyric Semiconductor, an MIT spin-off, claims that its probabilistic processors could speed up some kinds of computation by a factor of a thousand, allowing racks of servers to be replaced with small processing appliances.</span></p> <p style="text-align: justify;font-family: 'trebuchet ms'; "><span style="font-size:85%;">Calculations involving probabilities have a wide range of applications. Many <a href="http://en.wikipedia.org/wiki/Bayesian_spam_filtering">spam filters</a>, for example, work on the basis of probability; if an e-mail contains the word "Viagra" it's more likely to be spam than one which doesn't, and with enough of these likely-to-be-spam words, the filter can flag the mail as being spam with a high degree of confidence. Probabilities are represented as numbers between 0, impossible, and 1, certain. A fair coin toss has a probability of 0.5 of coming up heads.</span></p> <!--page 1--> <p style="text-align: justify;font-family: 'trebuchet ms'; "><span style="font-size:85%;">Traditional computers are based on digital logic. The signals used inside processors are either 0 volts, for a zero, or the full voltage of the circuit (V<sub>DD</sub>, in integrated circuit jargon) for a one. This has some nice properties: because the circuits only need to be fully on or fully off, they're quite robust against noise; a signal that momentarily goes a little bit above 0 V or drops a bit below V<sub>DD</sub> won't affect the interpretation; it'll still be a zero or a one. The nice, simple, zero-or-one nature of the circuit also makes it easier to reason about how it works, which makes building processors—and the software to run on them—easy.</span></p> <p style="text-align: justify;font-family: 'trebuchet ms'; "><span style="font-size:85%;">The big trade-off this causes is that most of the time, we want to manipulate numbers other than zero or one. 
There are infinite possible voltages between 0 and V<sub>DD</sub>, which could be used to represent <a href="http://en.wikipedia.org/wiki/Cardinality_of_the_continuum#Sets_with_cardinality_c">all the numbers we actually care about</a> in a nice compact way. With binary circuits, however, we're stuck with just two possible values—one binary digit. To represent all the other numbers we want to use, we have to use multiple digital signals—multiple bits—and binary arithmetic.</span></p> <p style="text-align: justify;font-family: 'trebuchet ms'; "><span style="font-size:85%;">This works very well—the computer revolution is testament to that—but it carries with it a certain level of complexity. Processors now have to manipulate dozens of signals together in order to work with simple concepts like "42" or "0.5", typically 32 or 64 at a time.</span></p> <p style="text-align: justify;font-family: 'trebuchet ms'; "><span style="font-size:85%;">Which brings us back to the probability computations. Normal computers use 32- or 64-bit floating point arithmetic to calculate probabilities. For computers that need to compute probabilities very quickly, this means that they need a lot of circuitry to handle all these bits, and indeed, that's why today's processors have billions of transistors.</span></p> <p style="text-align: justify;font-family: 'trebuchet ms'; "><span style="font-size:85%;">Lyric's innovation is to use analogue signals instead of digital ones, to allow probabilities to be encoded directly as voltages. Their probability gates represent zero probability as 0 V, and certainty as V<sub>DD</sub>. But unlike digital logic, for which these are the only options, Lyric's technology allows probabilities between 0 and 1 to use voltages <em>between</em> 0 and V<sub>DD</sub>. Each probabilistic bit ("pbit") stores not an exact value, but rather, the <em>probability</em> that the value is 1. 
The technology allows a resolution of about 8 bits; that is, they can discriminate between about 2<sup>8</sup> = 256 different values (different probabilities) between 0 and V<sub>DD</sub>.</span></p> <p style="text-align: justify;font-family: 'trebuchet ms'; "><span style="font-size:85%;">By creating circuits that can operate directly on probabilities, much of the extra complexity of digital circuits can be eliminated. Probabilistic processors can perform useful computations with just a handful of pbits, with a drastic reduction in the number of transistors and circuit complexity as a result.</span></p> <p style="text-align: justify;font-family: 'trebuchet ms'; "><span style="font-size:85%;">To do something useful with these pbits requires suitable <a href="http://en.wikipedia.org/wiki/Logic_gate">logic gates</a>, the building blocks of integrated circuits. The most important of these is the NAND gate. This is a gate with two inputs, with an output that is 1 as long as at least one of the inputs is 0, and 0 if they are both 1. NAND gates are important because any other gate can be constructed from one or more NAND gates. For example, a NOT gate, which outputs a 1 if the input is 0 and a 0 if the input is 1, can be constructed by sending the same value to both inputs of a NAND gate.</span></p> <p style="text-align: justify;font-family: 'trebuchet ms'; "><span style="font-size:85%;">Lyric Semiconductor has developed NAND gates that operate on probabilities rather than binary values. The input probabilities are combined using <a href="http://en.wikipedia.org/wiki/Bayesian_probability">Bayesian logic</a> rather than the <a href="http://en.wikipedia.org/wiki/Classical_logic">binary logic</a> of conventional processors. 
</span></p> <p style="text-align: justify;font-family: 'trebuchet ms'; "><span style="font-size:85%;">The company plans to build a processor dubbed GP5 that uses this technology to accelerate probability-heavy workloads, with applications such as spam filtering, shopping pattern analysis, and fraudulent transaction discovery being possible candidates. PCIe-based GP5 accelerator cards will give machines a thousand-fold improvement in probability computing performance. As well as the processor, the company is also creating a programming language, Probability Synthesis to Bayesian Logic, for programming the device. GP5 is still in the design stage, and the company plans to show it off in 2013.</span></p> <p style="text-align: justify;font-family: 'trebuchet ms'; "><span style="font-size:85%;">The GP5's logic gates will be interconnected to enable complex probabilistic calculations to be performed in an inherently parallel way. This will give a further improvement over traditional computers, as they must perform their probability calculations in a serial manner.</span></p> <p style="text-align: justify;font-family: 'trebuchet ms'; "><span style="font-size:85%;">More immediately, Lyric has devised a probability-based circuit for error-correction in flash memory. Flash memory produces something around one error in every 1,000 bits read. This is transparent to users, because the memory also includes circuitry to detect and correct these errors. This, however, adds complexity and takes up space, and as flash storage densities increase, the error rate is likely to worsen. Lyric Error Correction uses the same probabilistic logic to perform equivalent error detection and correction with about 3 percent of the circuitry, and 8 percent of the power, compared to conventional error correction. 
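The probabilistic NAND described above is easy to state numerically. Assuming the two inputs are independent probabilities (an assumption of this sketch, not a claim about Lyric's actual gate design), the output is 0 only when both inputs are 1, so P(out = 1) = 1 − P(a = 1)·P(b = 1); restricting the inputs to 0 and 1 collapses this to ordinary Boolean NAND.

```javascript
// Sketch of a NAND gate generalized to probabilities in [0, 1].
// Assumes independent inputs; illustrative only, not Lyric's circuit.
function pnand(p, q) {
  return 1 - p * q; // 0 only when both inputs are certainly 1
}

// With inputs in {0, 1} this is plain Boolean NAND, and a NOT gate
// (as described in the article) is pnand(p, p).
```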
LEC is available to use today, with chips manufactured by TSMC.</span></p> <p style="text-align: justify;font-family: 'trebuchet ms'; "><span style="font-size:85%;">Lyric Semiconductor hopes that this technology will eventually become mainstream. It claims that GP5 is a fifth generation of processor, following on from CPUs, <a href="http://en.wikipedia.org/wiki/Digital_signal_processor">DSPs</a>, <a href="http://en.wikipedia.org/wiki/Field-programmable_gate_array">FPGAs</a>, and GPUs. The first customers will likely be DARPA and unspecified three-letter agencies, and the initial devices will be expensive, but this is a technology that the company believes will one day be found in every computer."</span></p>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com0tag:blogger.com,1999:blog-5431337036958656574.post-86392168596131593192010-08-08T09:07:00.002+02:002010-08-08T11:39:21.671+02:00Intel Silicon Photonics<object width="640" height="385"><param name="movie" value="http://www.youtube.com/v/vz3DaACN_54&hl=en_US&fs=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/vz3DaACN_54&hl=en_US&fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="430" height="260"></embed></object>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com0tag:blogger.com,1999:blog-5431337036958656574.post-83634806588246573672010-03-29T23:14:00.005+02:002010-11-02T07:19:54.500+01:00Intel terascale<span style="font-size:85%;"><span style="font-family:trebuchet ms;">Intel believes threads are the answer to parallel/distributed computing. That's a good market strategy for the moment, but it is not the final solution to real parallel computing. Every program written in an OO or structured style follows an imperative paradigm. 
This makes it simple to create sequential sets of instructions. Threads are the easiest way to introduce parallelism into such a paradigm, but they are not simple to manage: critical sections and side effects make the program flow difficult to design and debug, and they are often at the root of inefficiencies and errors.<br />Functional programming languages make parallelism easy and fully exploitable, but they are not well suited to building large software platforms ... how to solve this problem?<br />Today we have the technology to spatially (i.e. in parallel) execute small functional graphs (or circuits) automatically generated from sequential sets of instructions. Look at this <a href="http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4380708">paper</a>. This could be the final solution to<br />the parallel computing dilemma: "parallel formalism"-"difficult programming" or "easy programming"-"sequential formalism".<br /><br />... For the moment, look at the terascale tech from Intel.<br /></span><br /></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">"Intel Labs has created an experimental “Single-chip Cloud Computer,” a research microprocessor containing the most Intel Architecture cores ever integrated on silicon CPU chip – 48 cores. 
It incorporates technologies intended to scale multi-core processors to 100 cores and beyond, such as an on-chip network, advanced power management technologies and support for “message-passing.” Architecturally, the chip resembles a cloud of computers integrated into silicon."<br /><br /><object style="height: 390px; width: 640px;"><param name="movie" value="http://www.youtube.com/v/L_cXi7uyJU4?version=3"><param name="allowFullScreen" value="true"><param name="allowScriptAccess" value="always"><embed src="http://www.youtube.com/v/L_cXi7uyJU4?version=3" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" height="390" width="640"></embed></object><br /></span></span><br /><ul style="font-family:trebuchet ms;"><li><span style="font-size:85%;"><strong>Scalable multi-core architectures</strong> which integrate streamlined processor cores and accelerators using a fast, energy-efficient, modular core-to-core infrastructure.<br /><em>Examples: <a href="http://techresearch.intel.com/articles/Tera-Scale/1449.htm">80-core prototype</a>, <a href="http://techresearch.intel.com/UserFiles/en-us/File/terascale/posters/MCEMU.pdf">Tera-scale Emulator</a>, Dynamic Thermal Management, <a href="http://www.intel.com/technology/itj/2007/v11i3/5-architectural_support/1-abstract.htm">Task Queues</a></em></span> <span style="font-size:85%;">.<br /> </span></li><li><span style="font-size:85%;"><strong>Memory sharing and stacking</strong> to provide a high bandwidth, flexible, cache & memory hierarchy which supports many simultaneous threads fairly and efficiently.<br /><em>Examples: <a href="http://www.intel.com/technology/itj/2007/v11i3/3-bandwidth/1-abstract.htm">3D Stacking</a>, Cache Quality of Service.<br /> </em></span> </li><li><span style="font-size:85%;"><strong>High Bandwidth I/O and Communications</strong> which balance the compute demands with I/O and network demands within the platform power and cost budgets.<br /><em>Examples: High-speed 
Copper I/O, <a href="http://techresearch.intel.com/articles/Tera-Scale/1419.htm">Silicon Photonics</a>, I/O Accelerators.</em></span> <span style="font-size:85%;"> </span></li></ul><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><br /></span></span>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com0tag:blogger.com,1999:blog-5431337036958656574.post-70312893334052680892009-09-19T09:17:00.002+02:002009-09-19T09:23:07.870+02:00Cell Matrix<span style="font-size:85%;">Another example of Cellular Architecture. It is older than the previous one but more interesting in my opinion:</span><br /><applet archive="http://www.cellmatrix.com/entryway/products/software/cellmatrix.jar" code="com.glendhu.cellmatrix.Simulator.class" height="450" width="600"><br /></applet><br /><br /><span style="font-size:85%;">"The cell matrix is a medium for generating custom electronic circuitry. The power and the subtlety behind the cell matrix is the fact that the circuitry need not be defined statically. Analogous to the compile-time versus run-time notion in software programming, the cell matrix allows static or dynamic circuit definitions. The programmer can create circuits which are a combination of static and dynamic regions. This has huge implications for problem-solving, and it extends the power of computers into some brave new worlds.<br />Circuits implemented on a cell matrix can be massively parallel. Control of the circuit can also be massively parallel. Unlike an FPGA, which is an externally configured device, the circuits within the cell matrix do not require external intervention or control to modify their behavior. Rather, the circuits within the cell matrix work together to generate and load configuration information into each other.<br />The scope of dynamic circuit definition can be predetermined by the programmer or not. 
The cell matrix is capable of learning, changing, evolution, and autonomous operation."<br /><br />see <a href="http://www.cellmatrix.com/">http://www.cellmatrix.com/ </a></span>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com14tag:blogger.com,1999:blog-5431337036958656574.post-59229441570900581392009-08-22T08:07:00.004+02:002009-08-22T08:18:55.713+02:00Illuminato X Machina<span style="font-size:85%;"><span style="font-family: trebuchet ms;">This is a very nice example of asynchronous cellular computation:</span></span><br /><br /><p><span style="text-align: center; display: block;"><object height="350" width="425"><param name="movie" value="http://www.youtube.com/v/ZBFoFYhC9B4&rel=1&fs=1&showsearch=0&hd=0"> <param name="allowfullscreen" value="true"> <param name="wmode" value="transparent"> <embed src="http://www.youtube.com/v/ZBFoFYhC9B4&rel=1&fs=1&showsearch=0&hd=0" type="application/x-shockwave-flash" allowfullscreen="true" wmode="transparent" height="350" width="425"></embed> </object></span></p>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com0tag:blogger.com,1999:blog-5431337036958656574.post-28710672035887604192009-01-02T09:29:00.004+01:002009-01-05T12:50:24.667+01:00Ct: C for Throughput Computing<span style=";font-family:trebuchet ms;font-size:85%;" >As the </span><span style=";font-family:trebuchet ms;font-size:85%;" >computational power of Intel architectures grows (by reducing transistor size and adding some more cores), new programming tools are required to exploit this new power. I actually prefer a different approach/solution to the problems that affect microelectronics at the current stage: I think ILP, more than MT, could be the right programming model for next-generation architectures, but at the same time I think that MT is the most suitable from the "market" point of view. 
Probably there are a lot of "things to do" by means of MT before introducing programmable logic into the consumer-level core.<br /><a href="http://techresearch.intel.com/articles/Tera-Scale/1514.htm">Here</a> we find a good example of "improved C/C++" that makes it easier to exploit MT on Intel's cores. A very interesting use case is described by<a href="http://techresearch.intel.com/userfiles/en-us/File/terascale/Ct-appnote-option-pricing.pdf"> this white paper</a> on option pricing. Monte Carlo algorithms benefit greatly from MT.<br /></span>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com0tag:blogger.com,1999:blog-5431337036958656574.post-44431457962502152042008-05-11T16:25:00.006+02:002008-05-12T09:26:31.058+02:00Memristor: analysis and impacts.<div style="text-align: justify;font-family:trebuchet ms;"><span style="font-size:85%;">Some <span style="font-style: italic;">cut&paste</span> from other sites, just to stress the importance of this new device:<br /><br /></span><div style="text-align: center;"><span style="font-size:85%;"><span style="font-weight: bold;">A "memristor", for memory resistor, would register how much current had passed.<br /><br /></span></span><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.hpl.hp.com/news/2008/apr-jun/images/memristor_article.gif"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 348px; height: 117px;" src="http://www.hpl.hp.com/news/2008/apr-jun/images/memristor_article.gif" alt="" border="0" /></a></div><span style="font-size:85%;"><span style="font-weight: bold;"><br /></span></span></div><div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;">"If memristors can be commercialized, it could lead to very dense, energy-efficient memory chips. 
Scientists have made devices that function like memristors, but it took a good number of transistors and several capacitors, Williams said. Memristor chips would function like flash memory and retain data even after a computer is turned off, but require less silicon, consume less energy,"<br />...<br />see <a href="http://en.wikipedia.org/wiki/Memristor">wikipedia </a>for a good formal description of memristor behaviour.<br />...<br /></span></span><div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;">A memristor effectively stores information because the level of its electrical resistance changes when current is applied. A typical resistor provides a stable level of resistance. By contrast, a memristor can have a high level of resistance, which can be interpreted by a computer as a "1" in data terms, and a low level can be interpreted as a "0." Thus, data can be recorded and rewritten by controlling current. In a sense, a memristor is a variable resistor that, through its resistance, reflects its own history, Williams said.<br />...<br />Engineers could, for example, develop a new kind of computer memory that would supplement and eventually replace today's commonly used dynamic random access memory (D-RAM). Computers using conventional D-RAM lack the ability to retain information once they are turned off. When power is restored to a D-RAM-based computer, a slow, energy-consuming "boot-up" process is necessary to retrieve data stored on a magnetic disk required to run the system.<br />...<br /><br />In my opinion, the impact of this new technology (if it proves possible to employ memristors in the current production process ... it is not easy to say at the current stage) will be huge. 
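The history-dependent resistance described above can be sketched numerically. The snippet below implements the linear ion-drift model from the HP paper; the parameter values are illustrative assumptions, not measured device data:

```python
R_ON, R_OFF = 100.0, 16_000.0   # bounds of the device resistance (ohms)
D = 10e-9                        # film thickness (m)
MU_V = 1e-14                     # dopant mobility (m^2 s^-1 V^-1)

def resistance(w):
    """Memristance for a doped-region width w (0 <= w <= D)."""
    x = w / D
    return R_ON * x + R_OFF * (1.0 - x)

def step(w, current, dt):
    """Advance the internal state: dw/dt = MU_V * (R_ON / D) * i(t)."""
    w += MU_V * (R_ON / D) * current * dt
    return min(max(w, 0.0), D)   # the dopant front is confined to the film

w = 0.1 * D
r0 = resistance(w)
for _ in range(1000):            # positive current drives w up ...
    w = step(w, current=1e-4, dt=1e-6)
r1 = resistance(w)
for _ in range(1000):            # ... and zero current freezes the state
    w = step(w, current=0.0, dt=1e-6)
r2 = resistance(w)

assert r1 < r0                   # resistance dropped under positive current
assert r2 == r1                  # non-volatile: state is kept with no current
```

The second loop is the whole point: with no current the state variable stops moving, so the device "remembers" its resistance, which is exactly the non-volatility argued for above.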
What I see now are, at least, three big revolutions:<br /></span></span><ol><li><span style="font-size:85%;"><span style="font-family:trebuchet ms;">mobile devices (smartphones and ultra-compact notebooks) will gain great advantages from the power-management and memory-storage point of view.</span></span></li><li><span style="font-size:85%;"><span style="font-family:trebuchet ms;">The traditional memory hierarchy in the desktop computer will be completely upset. Hard drives as we know them could disappear, and data access could be tremendously improved.</span></span></li><li><span style="font-size:85%;"><span style="font-family:trebuchet ms;">New kinds of configurable devices could be implemented (and, as a consequence, much more spatial surface for instruction-level spatial computation) --> larger devices for a larger bandwidth on data access.<br /></span></span></li></ol></div></div>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com0tag:blogger.com,1999:blog-5431337036958656574.post-32739286431804814592008-05-05T14:22:00.005+02:002008-05-05T14:44:37.021+02:00HP-Labs: Memristor<span style=";font-family:trebuchet ms;font-size:85%;" >Unfortunately I can't spend time updating my blog in this period, but such breaking news is too important not to report on a computing web site. I need to read some more to provide good feedback on this matter, but it seems very promising. 
So, </span><span style=";font-family:trebuchet ms;font-size:85%;" >in the meanwhile,</span><span style=";font-family:trebuchet ms;font-size:85%;" > have fun looking at </span><span style="font-size:85%;"><a style="font-family: trebuchet ms;" href="http://www.hpl.hp.com/news/2008/apr-jun/memristor.html"> this link.</a></span><br /><div style="text-align: justify;"><span style="font-size:85%;"><br /><a style="font-family: trebuchet ms;" href="http://www.hpl.hp.com/news/2008/apr-jun/memristor.html"> </a></span><div style="text-align: center;"><span style="font-size:85%;"><a style="font-family: trebuchet ms;" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhz37N7SswvwTEWH1fRAtPPeLU2mYX_v8TSIJ8En25HtyKDCjG6P6Ymcb8N43_mH-tlj9dDPrQp8hXTcKw4eM4KkWjdBL-5tF4FXlkOgm4KDom-zw2UXCq9wYRp_-dVihhH0rDisFRO9BI/s1600-h/memristor01.jpg"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhz37N7SswvwTEWH1fRAtPPeLU2mYX_v8TSIJ8En25HtyKDCjG6P6Ymcb8N43_mH-tlj9dDPrQp8hXTcKw4eM4KkWjdBL-5tF4FXlkOgm4KDom-zw2UXCq9wYRp_-dVihhH0rDisFRO9BI/s320/memristor01.jpg" alt="" id="BLOGGER_PHOTO_ID_5196868213541750866" border="0" /></a></span><span style=";font-family:trebuchet ms;font-size:85%;" >A nice picture of the memristor<br />...<br /></span><div style="text-align: left;"><span style=";font-family:trebuchet ms;font-size:85%;" > A brief description stolen from <a href="http://www.spectrum.ieee.org/may08/6207">here</a>. 
Sorry ...</span><br /><span style=";font-family:trebuchet ms;font-size:85%;" ><span style="font-family:trebuchet ms;">"</span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">In 1971, a University of California, Berkeley, engineer (Leon </span></span><span style=";font-family:trebuchet ms;font-size:85%;" >Chua</span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">) predicted that there should be a fourth element: a memory resistor, or memristor. But no one knew how to build one. Now, 37 years later, electronics have finally gotten small enough to reveal the secrets of that fourth element. The memristor, Hewlett-Packard researchers revealed today in the journal </span><span class="italic" style="font-family:trebuchet ms;">Nature ..."<br /><br /></span></span><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgl1YuRbM7u1FIdGP6ZpVDYyXElS33_UoPQ6sCf12DryhlfXl2gBxhK1ab4avHWeN3Mo4pxinFBaJ1R5tJVYxMC3g5IarV6Uv6kQWAjdD-vuRczMhNc_Lhovly_Z4l8vPfyGUxh0OI_fGc/s1600-h/meristor__.bmp"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgl1YuRbM7u1FIdGP6ZpVDYyXElS33_UoPQ6sCf12DryhlfXl2gBxhK1ab4avHWeN3Mo4pxinFBaJ1R5tJVYxMC3g5IarV6Uv6kQWAjdD-vuRczMhNc_Lhovly_Z4l8vPfyGUxh0OI_fGc/s320/meristor__.bmp" alt="" id="BLOGGER_PHOTO_ID_5196873719689824370" border="0" /></a><span style="font-size:85%;"><span class="italic" style="font-family:trebuchet ms;"><br /></span></span></div><div style="text-align: justify;"><span style=";font-family:trebuchet ms;font-size:85%;" ><span style="font-family:trebuchet ms;">" ... Chua deduced the existence of memristors from the mathematical relationships between the circuit elements. The four circuit quantities (charge, current, voltage, and magneti</span>c flux) can be related to each other in six ways. 
Two quantities are covered by basic physical laws, and three are covered by known circuit elements (resistor, capacitor, and inductor), says Columbia University electrical engineering professor David Vallancourt. That leaves one possible relation unaccounted for. Based on this realization, Chua proposed the memristor purely for the mathematical aesthetics of it, as a class of circuit element based on a relationship between charge and flux. "</span><br /></div><span style="font-size:85%;"><br /></span></div></div>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com0tag:blogger.com,1999:blog-5431337036958656574.post-84631130257876518712008-04-21T10:28:00.013+02:002008-04-23T16:52:05.485+02:00Spatial Computing & Compilers (2)<div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;">I developed a set of <a href="http://suif.stanford.edu/">SUIF</a> passes to extract parallelism from the source code of programs written in C. </span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">This set of passes constitutes a framework targeting a family of architectures called hybrid machines. See <a href="http://www.ieeexplore.ieee.org/xpl/freeabs_all.jsp?isnumber=4380602&arnumber=4380708&count=173&index=105">this</a></span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><a href="http://www.ieeexplore.ieee.org/xpl/freeabs_all.jsp?isnumber=4380602&arnumber=4380708&count=173&index=105"> paper</a> if you want to know the details. Unfortunately I cannot provide implementation details, because the work has been privately funded, hen</span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">ce details cannot be revealed, and all information that can be shared has been published in conference papers. 
But here I can provide some interesting hints about this matter.<br /><br /></span></span><div style="text-align: center;"><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxQ6Xj3mI2ei4BSZiBm28it0FuOgkHeogLg8cYZe5ECE-NcSXIUVirIDIV77qIBCLvXXrFvxB5gLobnpoxFmhlcbSSGFUvH7jTbs5dXOOA7CQovfovk6-0s3x0yeGFKuHVElugIeb8xwY/s1600-h/hy2.jpg"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxQ6Xj3mI2ei4BSZiBm28it0FuOgkHeogLg8cYZe5ECE-NcSXIUVirIDIV77qIBCLvXXrFvxB5gLobnpoxFmhlcbSSGFUvH7jTbs5dXOOA7CQovfovk6-0s3x0yeGFKuHVElugIeb8xwY/s320/hy2.jpg" alt="" id="BLOGGER_PHOTO_ID_5191955582703945794" border="0" /></a><span style="font-weight: bold;font-size:78%;" ><span style="font-family:trebuchet ms;">A hybrid architecture: interaction between the two cores is not specified. </span></span><span style="font-weight: bold;font-size:78%;" ><span style="font-family:trebuchet ms;">If concurrent cache access is exploited, </span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><span style="font-weight: bold;font-size:78%;" >a cache-coherency system is required; otherwise data can be accessed only through register files (easily manageable, but probably subject to a bottleneck). In general, at this level of description the system architecture is not very relevant, even though it becomes important when kernel ranking and compiler optimizations have to be performed.</span></span></span><br /></div></div><div style="text-align: justify;"><div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><br />What I built is a set of compilation passes able to discover the critical kernels of a program and translate them into a form easily manageable in a spatial domain by SSA. 
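The SSA idea that makes this translation possible can be illustrated on straight-line code in a few lines (a toy renamer of my own for illustration, not part of the framework):

```python
def to_ssa(stmts):
    """Rename straight-line assignments so each variable is defined once.

    stmts: list of (target, [operands]) tuples, e.g. ("x", ["x", "y"])
    for a statement like x = x + y (the operator itself is irrelevant
    to the renaming).
    """
    version = {}                     # latest SSA version of each variable

    def use(v):
        # Uses of not-yet-defined variables get version 0 (incoming values).
        return f"{v}{version[v]}" if v in version else f"{v}0"

    out = []
    for target, operands in stmts:
        ops = [use(o) for o in operands]   # read current versions first
        version[target] = version.get(target, 0) + 1
        out.append((f"{target}{version[target]}", ops))
    return out

# x = x + y; x = x * 2  becomes  x1 = x0 + y0; x2 = x1 * 2
ssa = to_ssa([("x", ["x", "y"]), ("x", ["x"])])
assert ssa == [("x1", ["x0", "y0"]), ("x2", ["x1"])]
```

Once every variable is defined exactly once, data dependences become explicit edges of a dataflow graph, which is what maps naturally onto a spatial circuit.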
The bulk of the work is about kernel discovery, extraction and representation; translation into a spatial formalism is then done by SSA, as hinted <a href="http://spatialcomputing.blogspot.com/2008/03/static-single-assignement.html">here</a>.</span></span><br /><span style="font-size:85%;"><span style="font-family:trebuchet ms;">What is important to note is that programs are </span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">in general </span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">sequential objects and (also thanks to the processor pipeline) very few lines of code in a program are worth parallelizing spatially, even though, given enough space, the performance improvement would be huge, above all in certain application domains (DSP, imaging, 3D engines ...)</span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">.<br />Posts about 'code intermediate representations' to appear.<br /><br /></span></span><br /></div><span style="font-size:85%;"><span style="font-family:trebuchet ms;"></span></span></div><span style="font-size:85%;"><span style="font-family:trebuchet ms;"></span></span>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com1tag:blogger.com,1999:blog-5431337036958656574.post-85252371739865735162008-04-10T12:56:00.011+02:002008-04-14T17:40:58.719+02:00Software Simulation Techniques<span style="font-family:trebuchet ms;"><span style="font-size:85%;">If you want to simulate a system, you have to take care of all its components and </span></span><span style="font-family:trebuchet ms;"><span style="font-size:85%;">the way they interact. The main and most difficult element to manage is time. 
Every </span></span><span style="font-family:trebuchet ms;"><span style="font-size:85%;">action happens at a precise time, and one or more actions can happen at the same time.</span></span><br /><div style="text-align: justify;"><span style="font-family:trebuchet ms;"><span style="font-size:85%;">We think of time as a continuous variable; in software, as with everything, time is discrete. </span></span><span style="font-family:trebuchet ms;"><span style="font-size:85%;">So, depending on your time resolution (i.e. clock resolution), you can find some actions that are performed as an effe</span></span><span style="font-family:trebuchet ms;"><span style="font-size:85%;">ct of (i.e. after) other events but at the same time (from the clock's point of view). This is the "execution pattern" at the base of the concept of "evaluate&update" or, for those who know SystemC, the "delta cycle".<br />A system simulation can be cycle-accurate or non-deterministic, or something in between these types.<br />Cycle accuracy requires every part of the system to be completely described and to have completely deterministic behavior.</span></span><br /><span style="font-family:trebuchet ms;"><span style="font-size:85%;">Non-deterministic simulation usually exploits concurrent threads to simulate the concurrency of different system elements; it is useful because it does not require precise system specifications, even if it produces unstable results. Such simulations are useful during the system analysis & design stages, but are dangerous if employed to define the specifications of a project.</span></span><br /><span style="font-family:trebuchet ms;"><span style="font-size:85%;">What I previously called "something else" are simulations with a clock but without a completely specified system; as a consequence they have some "uncertainty kernels" where concurrent actions are performed by a non-deterministic choice. BME is a "something-else" simulator. 
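The "evaluate&update" pattern can be sketched as a two-phase loop over delta cycles (a minimal sketch of the idea, not BME's or SystemC's actual engine):

```python
def settle(signals, processes):
    """One simulated time step as a chain of delta cycles.

    signals:   dict name -> current value
    processes: functions mapping the *current* signal values to the
               values they drive for the next delta cycle.
    Evaluate reads only current values; update commits them all at once,
    so the outcome is independent of process ordering.
    """
    for _ in range(1000):                         # guard against oscillation
        nxt = {}
        for proc in processes:                    # evaluate phase
            nxt.update(proc(signals))
        changed = {k: v for k, v in nxt.items() if signals.get(k) != v}
        if not changed:
            return signals                        # stable: time may advance
        signals = {**signals, **changed}          # update phase (delta cycle)
    raise RuntimeError("combinational loop did not settle")

# Two gates: b = not a, c = a and b.  Starting from a=1, b=1, the chain
# settles after a couple of delta cycles within the same time step.
sigs = settle(
    {"a": 1, "b": 1, "c": 0},
    [lambda s: {"b": 1 - s["a"]}, lambda s: {"c": s["a"] & s["b"]}],
)
assert sigs == {"a": 1, "b": 0, "c": 0}
```

Because all processes read the pre-update values, two events that fall in the same clock step still resolve in a deterministic cause-then-effect order, which is exactly what the delta cycle is for.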
Improving the system specification makes the simulator closer and closer to a "cycle-accurate" one.<br />NOTE: non-determinism is usually due to multi-threading.<br /><br /></span></span><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6ACVQvD8x46FpKg-m7ArarAhdj6_QcSvkQRY-JwAwgRiz-yzlQk3VQAQzJv66IXJjICnebikdAL4ZCT6wDSgYbc4f-nWbkkJhlnCxzIhlGs7jts56JVs2OgFX_3NEpBc97xOEOlaKg44/s1600-h/sim_tech_bme.JPG"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6ACVQvD8x46FpKg-m7ArarAhdj6_QcSvkQRY-JwAwgRiz-yzlQk3VQAQzJv66IXJjICnebikdAL4ZCT6wDSgYbc4f-nWbkkJhlnCxzIhlGs7jts56JVs2OgFX_3NEpBc97xOEOlaKg44/s320/sim_tech_bme.JPG" alt="" id="BLOGGER_PHOTO_ID_5188409490758481170" border="0" /></a><span style="font-family:trebuchet ms;"><span style="font-size:85%;">This image represents computation-time (probability) density functions for the same task executed on different architectural instances by BME. How these times are evaluated is not trivial: see this <a href="http://www.google.it/url?sa=t&ct=res&cd=1&url=http%3A%2F%2Fportal.acm.org%2Fft_gateway.cfm%3Fid%3D1102314%26type%3Dpdf&ei=CuwASMaXAaiu-QKn3umMBw&usg=AFQjCNEyDW8bPat-Wgn59CRqyf8eBwc4Tg&sig2=xh6exnMVGaVY6yp9agafqA">paper</a> for an exhaustive explanation. BME is designed to simulate multiple systems, and if the architecture is not completely specified, the execution engine exploits non-determinism to execute the actions comprised in a time step, i.e. 
it cannot exploit the concept of delta-cycle.</span></span><br /></div><span style="font-family:trebuchet ms;"><br /></span>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com1tag:blogger.com,1999:blog-5431337036958656574.post-26496989247555112992008-04-02T15:28:00.014+02:002010-08-22T23:56:14.060+02:00Evolutionary Computation<div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;">In the last post I spoke of evolutionary computation. I gave some hints about the environment, trying to stress that it should have some role, but before describing the environment some more information about evolutionary computation must be provided. I will try to give a brief description.</span></span><br /><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><span style="font-weight: bold; color: rgb(51, 255, 51);">Evolution </span>implies the existence of multiple generations of "objects". The objects of the same generation, by means of two operators,</span></span><br /><ul><li><span style="font-size:85%;"><span style="font-family:trebuchet ms;">recombination [<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4HbJZIGl2IWUyywzZG1c3RUDyXVN2BMHaJfID7CAyF4mjqNsSkVqr8o7OuhSNmClfOSdV1LS56CEFmfAp4w5DuOnQbq8a8rUQ2pSjlGtyZ4WHynH_QQ1vYX5ZgMYqgQX_Vik2IOKYqpI/s1600-h/subtree_co.jpg">see picture</a>]</span></span></li><li><span style="font-size:85%;"><span style="font-family:trebuchet ms;">mutation</span></span></li></ul><span style="font-size:85%;"><span style="font-family:trebuchet ms;">create a set of new "objects": the <span style="color: rgb(51, 255, 51);">offspring</span>.</span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;"> By means of the "offspring-selection&replacement" operator applied to both offspring and parents, a new generation is obtained.<br />Evolutionary programming exploits evolution to evolve programs. 
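The generation loop just described (recombination, mutation, offspring selection & replacement) fits in a few lines; a minimal sketch on bit strings with an invented toy fitness (counting 1s), not a serious evolutionary-programming system:

```python
import random

rng = random.Random(42)
LEN, POP, GENS = 32, 20, 40

def fitness(bits):                 # toy objective: count the 1s
    return sum(bits)

def recombine(a, b):               # one-point crossover
    cut = rng.randrange(1, LEN)
    return a[:cut] + b[cut:]

def mutate(bits, rate=0.02):       # independent bit flips
    return [b ^ (rng.random() < rate) for b in bits]

pop = [[rng.randint(0, 1) for _ in range(LEN)] for _ in range(POP)]
best0 = max(map(fitness, pop))
for _ in range(GENS):
    parents = sorted(pop, key=fitness, reverse=True)[: POP // 2]
    offspring = [
        mutate(recombine(rng.choice(parents), rng.choice(parents)))
        for _ in range(POP)
    ]
    # selection & replacement over offspring *and* parents (elitist)
    pop = sorted(parents + offspring, key=fitness, reverse=True)[:POP]

assert max(map(fitness, pop)) >= best0   # elitism: the best never degrades
```

Note that because parents compete with their offspring for survival, the best individual can never get worse from one generation to the next; dropping the parents from the replacement pool would remove that guarantee.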
</span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">I apologize for this raw description and for the lack of hint to <a href="http://www.genetic-programming.org/">genetic programming</a>: I want to be as simple as possible and I cannot write everything about everything ... ^_^".</span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><br />In general, evolutionary programming is managed with a top-down approach: a main program create an <span style="color: rgb(51, 255, 51);">environment </span>( ... hmm interesting) and manages the generations (in this case the objects are "programs") </span></span><span style="color: rgb(255, 153, 0);font-size:85%;" ><span style="font-family:trebuchet ms;"><span style="color: rgb(255, 0, 0);">sequentially !</span><span style="color: rgb(0, 0, 0);"><br /></span></span></span><div style="text-align: center;"><span style="color: rgb(255, 153, 0);font-size:85%;" ><span style="font-family:trebuchet ms;"><span style="color: rgb(0, 0, 0);"><span style="color: rgb(255, 255, 255);">Look @ </span><a href="http://darwin.sourceforge.net/moth/index.html">Darwin </a><span style="color: rgb(255, 255, 255);">if you want write some simple evolutionary application (note: I have not written evolutionary programming).</span></span></span></span><br /></div><span style="color: rgb(255, 153, 0);font-size:85%;" ><span style="font-family:trebuchet ms;"><span style="color: rgb(0, 0, 0);"><span style="color: rgb(204, 204, 204);">A different approach could be to exploit an environment where evolution is part or a characteristics of it and becomes an </span><span style="color: rgb(51, 255, 51);">operator </span><span style="color: rgb(204, 204, 204);">which subsets of involved objects can independently apply to themselves. 
This approach implies an asynchronous and unregulated evolution of the entire set of objects and requires a high level of distribution and good management of parallelism in the underlying sub-system: some more clues in the next posts.</span></span></span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><br /></span></span></div>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com0tag:blogger.com,1999:blog-5431337036958656574.post-84844701960174631942008-03-28T13:33:00.013+01:002008-03-30T10:46:23.433+02:00Self-Replication & Phylogenesis<div style="text-align: justify;"><span id="qna0" style=";font-family:Trebuchet MS;font-size:85%;" >We have already explained how software is able to replicate itself: it is possible to write a program which can </span><span id="qna0" style=";font-family:Trebuchet MS;font-size:85%;" >return its own code. We have hinted at the "universality<span style="font-family:trebuchet ms;">" of </span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">all conventional programming languages</span></span><span id="qna0" style=";font-family:Trebuchet MS;font-size:85%;" ><span style="font-family:trebuchet ms;">, i.e</span>. it is possible to write </span><span id="qna0" style=";font-family:Trebuchet MS;font-size:85%;" >a program in a language A which is able to execute programs written in A. I apologize for this raw summing up. </span><br /></div><div style="text-align: justify;"><div style="text-align: justify;"><span id="qna0" style=";font-family:Trebuchet MS;font-size:85%;" >So we can now write programs able to "understand" themselves: is this a primitive form of self-consciousness? 
</span><span id="qna0" style=";font-family:Trebuchet MS;font-size:85%;" >I will definitely not speak of these kinds of issues, even if I will probably provide something about "Computability & Unsolvability" ...</span><span id="qna0" style=";font-family:Trebuchet MS;font-size:85%;" >try googling such a string ... something about Martin Davis? ... :) </span><br /><span id="qna0" style=";font-family:Trebuchet MS;font-size:85%;" >Well, we can write a program (we call this program A_INT) able to produce a program (itself) as output and potentially </span><span id="qna0" style=";font-family:Trebuchet MS;font-size:85%;" >able to predict its behaviour on a (small?) set of inputs. Maybe A_INT can improve some of its own functions!</span><br /><span id="qna0" style=";font-family:Trebuchet MS;font-size:85%;" >Probably yes ... May this be a base principle for a real digital life? ... We should ask an expert. Probably </span><span id="qna0" style=";font-family:Trebuchet MS;font-size:85%;" ><a href="http://www.cs.ucl.ac.uk/staff/P.Bentley/">Peter Bentley</a> or <a href="http://ti.arc.nasa.gov/people/hornby/">Gregory Hornby</a></span><span id="qna0" style=";font-family:Trebuchet MS;font-size:85%;" > can provide more significant feedback on this matter.</span><br /><span id="qna0" style=";font-family:Trebuchet MS;font-size:85%;" >A_INT can dynamically define a sort of fitness function from the results obtained on the input or from the way the results are obtained (e.g. how long a computation takes) ... Wow!! We are embedding an evolutionary computation in A_INT: so we have obtained a self-replicating <span style="font-style: italic;">evolutionary </span>interpreter. 
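To make the idea of a dynamically defined fitness function concrete, here is a minimal generational sketch. It only illustrates the loop such an evolutionary interpreter would embed, not A_INT itself: the names `fitness`, `mutate` and `evolve` are invented for this example, and the "programs" are just integers whose output is compared against a target.

```python
import random

def fitness(candidate, target=42):
    # Lower is better: distance between the candidate's "output" and the target.
    return abs(candidate - target)

def mutate(candidate, rng):
    # A tiny random perturbation stands in for code mutation/recombination.
    return candidate + rng.choice([-1, 1])

def evolve(pop_size=20, generations=200, seed=0):
    rng = random.Random(seed)
    population = [rng.randint(0, 100) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population ...
        population.sort(key=fitness)
        survivors = population[: pop_size // 2]
        # ... and refill it by mutating randomly chosen survivors.
        population = survivors + [mutate(rng.choice(survivors), rng)
                                  for _ in range(pop_size - len(survivors))]
    return min(population, key=fitness)
```

Replacing the integers with program trees and `fitness` with measured behaviour (for example, how long a computation takes, as above) gives the self-improving flavour described in the post.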
</span><br /><span id="qna0" style=";font-family:Trebuchet MS;font-size:85%;" >Scientific literature is full of works on evolution applied to SW; the best examples (and the base of thousands of other works), in my </span><span id="qna0" style=";font-family:Trebuchet MS;font-size:85%;" >opinion, are the works on evolutionary programming by <a href="http://www.genetic-programming.com/johnkoza.html">John Koza.</a></span><br /><span id="qna0" style=";font-family:Trebuchet MS;font-size:85%;" >The following picture shows the main concept at the base of all Koza's works: code <span style="color: rgb(51, 255, 51);">recombination </span>on trees representing programs (i.e. a graphical representation of functional programming) and consequent <span style="color: rgb(51, 255, 51);">offspring generation</span>.</span><br /><br /><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4HbJZIGl2IWUyywzZG1c3RUDyXVN2BMHaJfID7CAyF4mjqNsSkVqr8o7OuhSNmClfOSdV1LS56CEFmfAp4w5DuOnQbq8a8rUQ2pSjlGtyZ4WHynH_QQ1vYX5ZgMYqgQX_Vik2IOKYqpI/s1600-h/subtree_co.jpg"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4HbJZIGl2IWUyywzZG1c3RUDyXVN2BMHaJfID7CAyF4mjqNsSkVqr8o7OuhSNmClfOSdV1LS56CEFmfAp4w5DuOnQbq8a8rUQ2pSjlGtyZ4WHynH_QQ1vYX5ZgMYqgQX_Vik2IOKYqpI/s320/subtree_co.jpg" alt="" id="BLOGGER_PHOTO_ID_5182770926185410082" border="0" /></a><span style="font-size:85%;"><span style="font-family:georgia;">.<span style="font-family:trebuchet ms;">.. hmmm I think there is something wrong ... what about the role of the environment? ... and, first of all, what is the environment? 
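The recombination shown in the picture can be sketched in a few lines, assuming programs are represented as nested tuples like `('add', ('mul', 'x', 2), 3)`. This is only an illustrative sketch: the helper names (`subtrees`, `replace`, `crossover`) are mine, not Koza's.

```python
import random

def subtrees(tree, path=()):
    # Enumerate every (path, subtree) pair; internal nodes are tuples
    # of the form (op, child, ...), leaves are terminals.
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace(tree, path, new):
    # Return a copy of tree with the subtree at the given path replaced.
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(parent_a, parent_b, rng):
    # Subtree crossover: swap one randomly chosen subtree of each
    # parent, producing two offspring trees.
    path_a, sub_a = rng.choice(list(subtrees(parent_a)))
    path_b, sub_b = rng.choice(list(subtrees(parent_b)))
    return replace(parent_a, path_a, sub_b), replace(parent_b, path_b, sub_a)
```

For example, `crossover(('add', ('mul', 'x', 2), 3), ('sub', 'y', 1), random.Random(0))` returns two offspring built from pieces of the two parents.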
</span></span></span><br /></div><br /></div>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com1tag:blogger.com,1999:blog-5431337036958656574.post-53943524761358922792008-03-25T10:34:00.005+01:002008-03-27T09:11:31.421+01:00a nano-hub lesson on nanoelectronics<span style="font-size:85%;"><span style="font-family:trebuchet ms;">Watch this interesting lesson ... </span></span><br /><div style="text-align: center;"><iframe allowfullscreen='allowfullscreen' webkitallowfullscreen='webkitallowfullscreen' mozallowfullscreen='mozallowfullscreen' width='320' height='266' src='https://www.blogger.com/video.g?token=AD6v5dxcq_9kcZsfbFMHaps51g3QOPEXp2CLsUb9z9dFzEfbbv9ZCxTla1UqIIUNpYrlxCAJM1odSyeBCKDJPGRWng' class='b-hbp-video b-uploaded' frameborder='0'></iframe></div>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com1tag:blogger.com,1999:blog-5431337036958656574.post-5929349406829003942008-03-23T13:09:00.010+01:002008-03-27T16:15:28.077+01:00Software & Hardware ... are they really different?<div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;">A compiler is a program which takes a source program as input and produces an equivalent source program in a different language. In general a compiler translates a source program into a </span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">low level representation which a CPU with a precise architecture can execute. This is the main reason a compiler is often related to a particular <span style="color: rgb(51, 255, 51);">target architecture</span>.</span></span><br /></div><div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;">The code life-cycle starts from the high level representation (typically written by a sw developer) and goes down to a low level representation, often a bit string standing for a sequence of assembly instructions. 
If the target architecture is configurable hardware, the bit string is a device configuration aimed at mapping a circuit onto the programmable device. In next posts I will provide more hints about code processing.<br />An interpreter is a program able to execute another program implemented in a particular language: it is possible to write an interpreter for the same language used to implement it (<a href="http://en.wikipedia.org/wiki/Universal_Turing_machine">here </a>the most popular example :)). In general it is </span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">easier </span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">to write an interpreter than to write a compiler, because a compiler, in general, comprises different layers for different optimizations ... more or less architecture dependent.<br /></span></span><ul><li><span style="font-size:85%;"><span style="font-family:trebuchet ms;">See <a href="http://www.antlr.org/">this site</a> if you want to practice with interpreter implementation.</span></span></li><li><span style="font-size:85%;"><span style="font-family:trebuchet ms;">See <a href="http://suif.stanford.edu/">this site</a> if you want to learn something more about compiler </span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">implementation</span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">.</span></span></li></ul><span style="font-size:85%;"><span style="font-family:trebuchet ms;">In next posts I will provide more links to some useful textbooks on these issues, even if they require a good background in computer science and sw engineering.<br /><br /></span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">Most people think HW is an electronic device able to execute a SW program, and only "expert" people know HW and SW are the same thing ... 
a processor is simply an electronic implementation of an interpreter for a particular language: its own assembly language, which is often known as the ISA (instruction set architecture).<br />In general a standard CPU has a Data-path+Ctrl architecture and a hierarchical memory system: register file -> cache(s) -> RAM -> disk (virtual memory). See below the MIPS architecture, as usually shown in computer architecture courses:</span></span><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEih68UpV7ZosGiET2S81JYnx7xun_v0cGWqG1bceB3H1enF0h7oiq0a1Z-XUo-TpuBFF0r4qgCqTxxJzk3tca1yitcuLIt8U9mlZgc01xmuWaCnYU-p9jotUl2xlU9l5mPmrwLb6r7e9lY/s1600-h/mips.jpg"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEih68UpV7ZosGiET2S81JYnx7xun_v0cGWqG1bceB3H1enF0h7oiq0a1Z-XUo-TpuBFF0r4qgCqTxxJzk3tca1yitcuLIt8U9mlZgc01xmuWaCnYU-p9jotUl2xlU9l5mPmrwLb6r7e9lY/s320/mips.jpg" alt="" id="BLOGGER_PHOTO_ID_5180919876885212690" border="0" /></a><span style="font-size:85%;"><span style="font-family:trebuchet ms;">A programmable device executes a task after having acquired a configuration which implements a particular architecture: "general purpose" or, more frequently, "application specific". The configuration has a SW description in VHDL or Verilog, written by hand or returned by a compiler. This issue will be analysed in more depth in next posts; for now, what is interesting to note is that a configuration stands for a piece of HW mapped on a configurable layer.<br />In next generation architectures we will probably see packages of configuration sw libraries provided with CPUs coupled/interfaced with a configurable core: the CPU generations will be tagged by both the HW specification (such as the production process technology) and the improvement of the configuration libraries. 
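The claim that a processor is just an interpreter for its own ISA can be made concrete with a toy software model. The accumulator machine below is purely illustrative (the opcodes and the `run` function are invented for this sketch, not any real ISA):

```python
def run(program, acc=0):
    # Interpret a tiny accumulator ISA: the program is a list of
    # (opcode, operand) pairs; "acc" plays the role of the register file.
    pc = 0  # program counter
    while pc < len(program):
        op, arg = program[pc]       # fetch + decode
        if op == "LOAD":            # execute
            acc = arg
        elif op == "ADD":
            acc += arg
        elif op == "MUL":
            acc *= arg
        elif op == "JNZ":           # jump to absolute address if acc != 0
            if acc != 0:
                pc = arg
                continue
        else:
            raise ValueError("unknown opcode: %r" % (op,))
        pc += 1
    return acc
```

A hardware CPU implements exactly this fetch-decode-execute loop in logic; for instance `run([("LOAD", 6), ("MUL", 7)])` evaluates to 42.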
Look at <a href="http://download.intel.com/technology/platforms/quickassist/quickassist_aal_whitepaper.pdf">this.</a><br /><br /></span></span></div>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com0tag:blogger.com,1999:blog-5431337036958656574.post-29853833917148238542008-03-19T10:43:00.015+01:002008-03-27T16:19:05.357+01:00Cellular architectures<span style="font-size:85%;"><span style="font-family:trebuchet ms;">What kind of computational system can be classified as a cellular architecture? Such a question is apparently simple:</span></span><br /><div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;">"a system able to perform a computation by means of a set of simple processing elements" is certainly an exhaustive and comprehensive definition but hides a lot of complex issues we will have to analyse deeply.</span></span><br /><span style="font-size:85%;"><span style="font-family:trebuchet ms;">What about the PE computational capability? What kind of interconnection? How fast is the communication? What about the communication protocol? Is there a PE synchronization signal? ... These are all important questions to which it is necessary to provide a precise answer, and each of them is very complex.</span></span><br /><span style="font-size:85%;"><span style="font-family:trebuchet ms;">If you desire to define a cellular architecture you have to ask yourself: "What does such a system have to compute? 
"<br />Cellular computation has great advantages and disadvantages:</span></span><ul><li><span style="font-size:85%;"><span style="font-family:trebuchet ms;">it is potentially fault tolerant,</span></span></li><li><span style="font-size:85%;"><span style="font-family:trebuchet ms;">it provides a computational surface suitable for power management,</span></span></li><li><span style="font-size:85%;"><span style="font-family:trebuchet ms;">the simpler and denser it is, the more a computation can express its own intrinsic parallelism,</span></span></li><li><span style="font-size:85%;"><span style="font-family:trebuchet ms;">parallelism does not always imply efficiency: cellular computation imposes parallelism,</span></span></li><li><span style="font-size:85%;"><span style="font-family:trebuchet ms;">communication is difficult to manage, above all if it is required to be reliable,</span></span></li><li><span style="font-size:85%;"><span style="font-family:trebuchet ms;">traditional programming languages (both structural and OO) do not express parallelism,</span></span></li><li><span style="font-size:85%;"><span style="font-family:trebuchet ms;">hence compilers require much more intelligence.</span></span></li></ul><div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;">A "computer science" point of view on cellular computation can be found <a href="http://www.moshesipper.com/">in this book.</a></span></span><br /><span style="font-size:85%;"><span style="font-family:trebuchet ms;">An innovative kind of dynamically reconfigurable PE array is illustrated <a href="http://www.cellmatrix.com/">here </a>and it has strongly influenced my research work.</span></span><br /><span style="font-size:85%;"><span style="font-family:trebuchet ms;">Here below, three different SW simulations</span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;"> (by means of <a 
href="http://spatialcomputing.blogspot.com/2008/03/bme-graphic-user-interface-spatial.html">BME simulator</a>) of a </span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">DCT </span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">cellular computation are shown:</span></span><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg94IUNcnx84NMl4qr8yCFDafiKk06hu9myjSNNwSAk5x6cMbxN4KlXKUcHQm-88DeX7QPzjxvNIMClk4kN2ABGzJSbpC7LAXpbdTciHiU1-FoikNXith_fvx1EEMPKSL6iHnxYjLRoty4/s1600-h/dct-cc.JPG"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg94IUNcnx84NMl4qr8yCFDafiKk06hu9myjSNNwSAk5x6cMbxN4KlXKUcHQm-88DeX7QPzjxvNIMClk4kN2ABGzJSbpC7LAXpbdTciHiU1-FoikNXith_fvx1EEMPKSL6iHnxYjLRoty4/s320/dct-cc.JPG" alt="" id="BLOGGER_PHOTO_ID_5179440621728837826" border="0" /></a><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi1vZcSCiYQsa_99uTWyAc7gRz1vv1-kGAJmhoKQ04FUtWcMag_yeu_xr8mD2XhoDRGm8K5LP4T2TftOV54JJsc59rorss6qC4heS7sz8kwXqdFznA2ts_jrGpq6TnR7x59w6Y85XIGM8I/s1600-h/dct-formula.JPG"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi1vZcSCiYQsa_99uTWyAc7gRz1vv1-kGAJmhoKQ04FUtWcMag_yeu_xr8mD2XhoDRGm8K5LP4T2TftOV54JJsc59rorss6qC4heS7sz8kwXqdFznA2ts_jrGpq6TnR7x59w6Y85XIGM8I/s320/dct-formula.JPG" alt="" id="BLOGGER_PHOTO_ID_5179440737692954834" border="0" /></a><br /></div><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><br /></span></span></div>Alberto 
Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com0tag:blogger.com,1999:blog-5431337036958656574.post-9614242700296705562008-03-18T09:20:00.023+01:002008-03-18T23:24:16.005+01:00Static Single Assignment<div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;">Static Single Assignment (SSA) is an intermediate representation (IR) used in the compilation flow of advanced optimizing compilers. It covers the semantic gap between the imperative paradigm typical of procedural languages and a data flow representation suitable for translating code into a high level circuit representation. </span></span><br /><span style="font-size:85%;"><span style="font-family:trebuchet ms;">In SSA a variable is a <span style="color: rgb(51, 255, 51);">value </span>and not a memory address as usual. So an assignment to a variable can be made just once. Look at the following example:</span></span><br /></div><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><br /><span style="color: rgb(51, 204, 0);font-family:courier new;" > if (cond){</span><br /><span style="color: rgb(51, 204, 0);font-family:courier new;" > a = (b*d) >> 2;</span><span style="color: rgb(51, 204, 0);"> </span><span style="color: rgb(51, 204, 0);font-family:courier new;" ><br />}</span><br /> <span style="color: rgb(51, 204, 0);font-family:courier new;" >else{</span><br /> <span style="color: rgb(51, 204, 0);font-family:courier new;" >a--;</span><br /> <span style="color: rgb(51, 204, 0);font-family:courier new;" >}</span><br /> <span style="font-family:courier new;"><span style="color: rgb(51, 204, 0);">return a;</span><br /><br /><span style="font-family:trebuchet ms;">The SSA form will be:</span><br /><br /><span style="color: rgb(51, 204, 0);"> if (cond){</span><br /><span style="color: rgb(51, 204, 0);"> a0 = (b*d) >> 2;</span></span></span></span><br /><span style="color: rgb(51, 204, 0);font-size:85%;" ><span 
style="font-family:trebuchet ms;"><span style="font-family:courier new;"> }<br />else{<br /> a1 = a - 1;<br />}<br />a2 = PHI(a0,a1);<br />return a2;</span></span></span><br /><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><span style="font-family:courier new;"><span style="font-family:trebuchet ms;"><br /></span></span></span></span><br /><div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><span style="font-family:courier new;"><span style="font-family:trebuchet ms;">Obviously, the "cond" evaluation is redundant! But the SSA </span></span></span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><span style="font-family:courier new;"><span style="font-family:trebuchet ms;">representation </span></span></span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><span style="font-family:courier new;"><span style="font-family:trebuchet ms;">(... well the form </span></span></span><span style="font-family:trebuchet ms;">described in </span><cite style="font-family: trebuchet ms;">R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and K. Zadeck. Efficiently Computing Static Single Assignment Form and the Control Dependence Graph. ACM Transactions on Programming Languages and Systems, 13(4):451-490, October 1991</cite></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><span style="font-family:courier new;"><span style="font-family:trebuchet ms;">) does not modify the control structures and adds a Phi node at the end of the computation flow to select the right value to return.</span></span></span></span><br /></div><div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><span style="font-family:courier new;"><span style="font-family:trebuchet ms;">Such a form can be employed to perform speculation aimed at the spatial deployment of computation, i.e. 
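In software, the Phi node of the example above behaves like a select: under speculation both branch values can be computed unconditionally, and the PHI reduces to a multiplexer. A minimal sketch mirroring the example (the function name is mine, and `PHI` is modelled with a conditional expression):

```python
def ssa_phi_example(cond, a, b, d):
    # Speculation: both branch values are computed unconditionally,
    # as two independent pieces of "circuit" ...
    a0 = (b * d) >> 2   # value produced by the then-branch
    a1 = a - 1          # value produced by the else-branch (a--)
    # ... and the PHI node acts as a multiplexer driven by "cond".
    a2 = a0 if cond else a1
    return a2
```

In a circuit, `a0` and `a1` would be computed by parallel logic and `cond` would drive the select input of a real multiplexer.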
circuit synthesis, by exploiting the PHI node as a multiplexer to select the output as a function of the condition value. Look at the following data flow graph:</span></span></span></span><br /></div><div style="text-align: justify;"><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhKhPK4T9WFviu4VyAI9qmXkJ4_O_UXsu64C1vkJFQR8UI8rOEDI2AzLNtEHNzRV0ETAcLl9NeVQnIyhR9FnuEzaPf1xyCSkUdcFRY16JvLAmiGp1RKyF4_ezmCf2ffMtfe4RfctcpQIcY/s1600-h/ssa_df.JPG"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhKhPK4T9WFviu4VyAI9qmXkJ4_O_UXsu64C1vkJFQR8UI8rOEDI2AzLNtEHNzRV0ETAcLl9NeVQnIyhR9FnuEzaPf1xyCSkUdcFRY16JvLAmiGp1RKyF4_ezmCf2ffMtfe4RfctcpQIcY/s320/ssa_df.JPG" alt="" id="BLOGGER_PHOTO_ID_5179026084370329778" border="0" /></a><span style="font-size:85%;"><span style="font-family:trebuchet ms;">It is easy to see that SSA is an explicit definition of the data-flow graph. The optimization is marked by the picture: "cond" and the operators "-","*", ">>" may potentially be executed in parallel</span></span>. <span style="font-size:85%;"><span style="font-family:trebuchet ms;">How much this parallel execution gains depends on the complexity of the "cond" evaluation; anyway, <span style="color: rgb(51, 255, 51);">the gain in time is provided by more logic (i.e. 
space)</span> exploited by implementing the circuit.<br /></div><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><span style="font-family:courier new;"><span style="font-family:trebuchet ms;"><br /></span></span> </span></span>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com0tag:blogger.com,1999:blog-5431337036958656574.post-14695940221520492182008-03-15T17:28:00.004+01:002008-03-17T15:17:42.879+01:00a hint to the next generation computational architectures<span style="font-size:85%;"><span style="font-family:trebuchet ms;">Computational architectures are losing their traditional structure. Nowadays there are no CPUs with a single core</span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">, a</span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;"> <span style="color: rgb(51, 255, 51);">CPU has </span></span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><span style="color: rgb(51, 255, 51);">definitely </span></span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;"><span style="color: rgb(51, 255, 51);">become a computing system</span>. </span> <span style="font-family:trebuchet ms;">The more technology advances, the less complex computational structures (in their elementary devices and interconnections) can be. 
</span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">An interesting approach to HW automatic synthesis at nano-metric scale are </span><a style="font-family: trebuchet ms;" href="http://csdl2.computer.org/persagen/DLAbsToc.jsp?resourcePath=/dl/proceedings/&toc=comp/proceedings/dac/2003/2394/00/2394toc.xml&DOI=10.1109/DAC.2003.1219125">regular fabrics</a><span style="font-family:trebuchet ms;"> that could represent, in the next future, the right way to build complex architectures by exploiting regular interconnection among homogeneous logic blocks . Such an approach is very similar to the concept at the base of programmable logic.</span></span><br /><div style="text-align: justify;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;">FPGA are very interesting by the research point of view and well exploited in some critical applications, above all in the telecom domain. A post completely dedicated to the FPGA will be written in the next future.</span></span><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgEwfE5Ry_Z2hOCgIKxjY-SdP8dKCpLMbDu2fTUjWy7Q04yRFgPtilYwQNqhAg9b5AIy0UmZvwwII291TG_HXXkH7YLlqZtgZXwrjwXGkmidtv6shCrMeoG8LqR5Og1_TOHBpTreUB8g1s/s1600-h/fpga.jpg"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 203px; height: 147px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgEwfE5Ry_Z2hOCgIKxjY-SdP8dKCpLMbDu2fTUjWy7Q04yRFgPtilYwQNqhAg9b5AIy0UmZvwwII291TG_HXXkH7YLlqZtgZXwrjwXGkmidtv6shCrMeoG8LqR5Og1_TOHBpTreUB8g1s/s320/fpga.jpg" alt="" id="BLOGGER_PHOTO_ID_5178625660274356370" border="0" /></a><span style="font-size:85%;"><span style="font-family:trebuchet ms;">I personally think that every computational architectures will be equipped with a programmable core </span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">in the 
future</span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">, because this is the only way to "respect" Moore's law and to build faster and faster processors.</span></span><br /><span style="font-size:85%;"><span style="font-family:trebuchet ms;">Since</span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;"> "yesterday", ... :) ... </span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">in my opinion </span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;">the revolution has already</span></span><span style="font-size:85%;"><span style="font-family:trebuchet ms;"> begun (see </span><a style="font-family: trebuchet ms;" href="http://techresearch.intel.com/articles/Tera-Scale/1421.htm">terascale</a><span style="font-family:trebuchet ms;">, </span><a style="font-family: trebuchet ms;" href="http://en.wikipedia.org/wiki/Cell_microprocessor">cell processor</a><span style="font-family:trebuchet ms;">, </span><a style="font-family: trebuchet ms;" href="http://download.intel.com/technology/platforms/quickassist/quickassist_aal_whitepaper.pdf">IAAL</a><span style="font-family:trebuchet ms;">) ... ; so far, processor performance gain was provided by component miniaturization, which granted a shorter gate delay. Wire delay, instead, has always remained at the same level, causing great difficulties in communication and synchronization (by a consistent clock signal on the chip surface). 
The physical problems due to high resolution photolithography will be described in next posts ...<br /></span></span><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeiYzHVXEOCOBIaBteqaLHv_yGspBAuHdHTRZHpvEmjxLIumVwFUWa-gasp_NCDWJUUuc8faFzoOHw8pgU3AzHVruCDtttFFVSpx3ves33DQtAQXiwaDsCrfKgBrc7JJkzHEFFU1li_3g/s1600-h/gdvswd.JPG"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeiYzHVXEOCOBIaBteqaLHv_yGspBAuHdHTRZHpvEmjxLIumVwFUWa-gasp_NCDWJUUuc8faFzoOHw8pgU3AzHVruCDtttFFVSpx3ves33DQtAQXiwaDsCrfKgBrc7JJkzHEFFU1li_3g/s320/gdvswd.JPG" alt="" id="BLOGGER_PHOTO_ID_5178707934667878562" border="0" /></a><span style="font-size:85%;"><span style="font-family:trebuchet ms;">On the other hand, today, the performance improvement is reached through the evolution of architectural paradigms: multi-cores, DSP arrays and, finally, reconfigurable arrays.</span></span><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMko4jzPVMBhH7hsUbu8wL_YCmMMxm0_503lpdYUO4foxishYWBDD4Tx2xhdwXw1HUiV7kCkbU1tMrPWgwYqVlnbbjn-64OhvjhSagpZGOc0sA62o57axt0otmvIpeqOUhjDwTkDEVlE0/s1600-h/Cell_Broadband_Engine_Processor.jpg"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 205px; height: 281px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMko4jzPVMBhH7hsUbu8wL_YCmMMxm0_503lpdYUO4foxishYWBDD4Tx2xhdwXw1HUiV7kCkbU1tMrPWgwYqVlnbbjn-64OhvjhSagpZGOc0sA62o57axt0otmvIpeqOUhjDwTkDEVlE0/s320/Cell_Broadband_Engine_Processor.jpg" alt="" id="BLOGGER_PHOTO_ID_5178019623209021570" border="0" /></a></div>Alberto 
Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com0tag:blogger.com,1999:blog-5431337036958656574.post-34386823292930708652008-03-13T11:14:00.000+01:002008-03-17T10:15:43.200+01:00self replication in computer science<span style="font-family:trebuchet ms;"><span style="font-size:85%;">The following program (C language) prints its own source code on stdout:<br /><br /><span style="font-family:courier new;">char * f="char * f=%c%s%c;%cint main(){%c printf(f,34,f,34,10,10,10,10);%c}%c";</span><br /><span style="font-family:courier new;">int main(){</span><br /><span style="font-family:courier new;"> printf(f,34,f,34,10,10,10,10);</span><br /><span style="font-family:courier new;">}</span><br /><br />It is one of the funniest things a program can do:<span style="color: rgb(255, 0, 0);"> replicate itself</span>!<br />Self replication is one of the most interesting features in computer science: it has intrigued researchers since the birth of the discipline. Computer viruses often replicate their own code to spread themselves on different computers.<br />The most famous application of self-replication in theoretical computer science is the universal </span></span><span style="font-family:trebuchet ms;"><span style="font-size:85%;">constructor </span></span><span style="font-family:trebuchet ms;"><span style="font-size:85%;">by Von Neumann (look at <a href="http://en.wikipedia.org/wiki/Von_Neumann_universal_constructor">this link)</a>. 
Here below is a fine picture of the universal constructor while it is replicating itself:<br /><br /></span></span><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjw4bQbevixZcBqrEhwlgYnSuHBpqJsQ8F1yzrqp3bHt_4frXPXWNvA0Ih6Nr9vRcmX2_BiDnQG1fVr8gvzdldAHC3q7he60zIVHdrB8JaLNOx3XHhsl91rl-y8dY0xUisfco_oyF0wtpY/s1600-h/vonneumanuc.gif"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjw4bQbevixZcBqrEhwlgYnSuHBpqJsQ8F1yzrqp3bHt_4frXPXWNvA0Ih6Nr9vRcmX2_BiDnQG1fVr8gvzdldAHC3q7he60zIVHdrB8JaLNOx3XHhsl91rl-y8dY0xUisfco_oyF0wtpY/s320/vonneumanuc.gif" alt="" id="BLOGGER_PHOTO_ID_5177183259932504178" border="0" /></a><span style="font-family:trebuchet ms;"><span style="font-size:85%;"><br /></span></span><span style="font-family:trebuchet ms;"><span style="font-size:85%;"><span style="color: rgb(51, 255, 51);">Embryonics </span>and the related <span style="color: rgb(51, 255, 51);">POEtic project </span>are very interesting projects which aim to exploit self-replication to compute and manage faults in HW and consequently in the SW mapped on the (bio-inspired) computational layer. 
They are described <a href="http://lslwww.epfl.ch/pages/embryonics/">here</a>.</span></span><br /><span style="font-family:trebuchet ms;"><span style="font-size:85%;"><br /><br /></span></span>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com0tag:blogger.com,1999:blog-5431337036958656574.post-46907658203535658792008-03-12T15:35:00.000+01:002008-03-17T10:18:05.217+01:00spatial computation & compilers<div style="text-align: center;"><div style="text-align: left;"><span style="font-size:85%;">Here are a couple of pictures representing the two main issues of my research work.<br /></span><ol><li><span style="font-size:85%;">A simulation environment to study the <span style="font-weight: bold; color: rgb(51, 255, 51);">spatial deployment</span><span style="font-weight: bold; color: rgb(51, 255, 51);"> </span>of computation and PE (processing element) reconfiguration dynamics.<br /></span></li><li><span style="font-size:85%;">A compilation framework able to extract<span style="font-weight: bold;"> </span><span style="color: rgb(51, 255, 51);">critical computational kernels</span><span style="font-weight: bold;"> </span>suitable to be executed in parallel. 
All compilation analyses and optimizations are mainly based on structural analysis, and spatial execution is strongly <span style="font-weight: bold; color: rgb(51, 255, 51);">ILP </span>oriented.</span><br /></li></ol></div><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiP6cQx48C2kOnPysdBMv2Zz8wZY3XF5q-NEucJB17kESq2JVCMGpQcOXT3hUhEcb3VIB08qLQ91FWYaYf-qVaZgK1N_8yfDmGJDwp3I9G0UfSpq4SQGIaQq3TbIOwYt5ApqCEkElm7mT4/s1600-h/bme.JPG"><img style="cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiP6cQx48C2kOnPysdBMv2Zz8wZY3XF5q-NEucJB17kESq2JVCMGpQcOXT3hUhEcb3VIB08qLQ91FWYaYf-qVaZgK1N_8yfDmGJDwp3I9G0UfSpq4SQGIaQq3TbIOwYt5ApqCEkElm7mT4/s320/bme.JPG" alt="" id="BLOGGER_PHOTO_ID_5176864414445350978" border="0" /></a><br /><span style="font-size:78%;"><span style="font-weight: bold;"> BME graphic user interface<br /><br /></span></span><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhYNEZVtbgE-bhRLy3GFe8gOlS7eQTUwDVnVz1ZrcOJ4Y5AkhWcWlFZ5zBjbjWYgMm15H_oAU9edOj_PSKGAOwoReoqAIwgIIgBolW2I5g1tBkEuaD9pOJFx0JDGyEfc4bq7qgar8iQY6w/s1600-h/HCL-ASCL.JPG"><img style="cursor: pointer;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhYNEZVtbgE-bhRLy3GFe8gOlS7eQTUwDVnVz1ZrcOJ4Y5AkhWcWlFZ5zBjbjWYgMm15H_oAU9edOj_PSKGAOwoReoqAIwgIIgBolW2I5g1tBkEuaD9pOJFx0JDGyEfc4bq7qgar8iQY6w/s320/HCL-ASCL.JPG" alt="" id="BLOGGER_PHOTO_ID_5176865127409922130" border="0" /></a><br /><span style="font-size:78%;"><span style="font-weight: bold;">Spatial-Computation compiler: code life-cycle<br /><br /></span></span><div style="text-align: justify;"><span style="font-size:85%;">The next posts will describe these themes in more detail. Some more technical details can be found in my conference papers. 
In the future I will post useful and interesting references related to these themes, showing where technology is going from both the electronics and the computer-science points of view.<br /><br /></span></div></div>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com0tag:blogger.com,1999:blog-5431337036958656574.post-60249919697980427912008-03-12T09:06:00.001+01:002009-09-18T07:20:44.073+02:00int main (int argc, char ** argv) { ...<span style=";font-family:trebuchet ms;font-size:85%;" >This blog will describe everything I consider interesting and useful to know about computer science. I will write about theoretical computer science, the impact of technology on computation and the formalisms that define it, security issues, embedded systems, software engineering (of both process and structure), and OO programming.<br />I just want to publish the things I like and am working on, without pretending to create a managed discussion on particular themes ... so let's start ... I will begin by posting the abstract of my PhD dissertation.<br />It should say enough about me.<br /></span><p class="western" style="text-indent: 0.1in; margin-bottom: 0in; font-family: trebuchet ms;" align="justify" lang="it-IT"><br /></p><div style="text-align: left;font-family:trebuchet ms;"><span style="color: rgb(51, 255, 51);" lang="en-US"><b>Computational and Programming Models for Machines Based on Molecular-Scale Devices.</b></span><span style="font-style: italic;"><br /><span style="font-size:85%;"><br /></span><span style="color: rgb(102, 255, 153);font-size:85%;" >Alberto Gallini, Ph.D.
dissertation - abstract</span></span><span style="font-size:85%;"><span style="font-style: italic;"><br /></span></span><span lang="en-US" style="font-size:85%;"><span style="font-style: italic;"></span></span><br /><br /> <p class="MsoNormal" style="text-align: justify; text-indent: 7.1pt;font-family:trebuchet ms;"><span style="font-size:85%;">The planar lithography process used to build microprocessors on silicon has reached very high resolutions: nowadays commercial processors from Intel are produced with a 65-nm process (the figure indicates the transistor feature size), and devices based on a 45-nm process have already been announced and are well established on the road map.<br /></span></p><p class="MsoNormal" style="text-align: justify; text-indent: 7.1pt;font-family:trebuchet ms;"><span style="font-size:85%;">In spite of the improvements in speed and computational capability brought by advances in production processes, the number and restrictiveness of the constraints that such extreme miniaturization imposes on designers are growing fast, leading to deep changes in architectural paradigms.<br /></span></p><p class="MsoNormal" style="text-align: justify; text-indent: 7.1pt;font-family:trebuchet ms;"><span style="font-size:85%;">“Massive distribution”, “asynchronous communication”, “fine-grained parallelism”, and “dynamic configuration” are becoming more and more important keywords in both the technology and computer-architecture research fields, making these years a very exciting period in computer science.
Such a scenario pushes research toward the exploration of new, non-conventional computational paradigms, for both architectures and computational models, aimed at the management and efficiency of machines characterized by high component-interaction rates, heterogeneity, and unreliability.</span></p><p class="MsoNormal" style="text-align: justify; text-indent: 7.1pt;font-family:trebuchet ms;"><span style="font-size:85%;"><span style="font-family:trebuchet ms;">In this dissertation a “horizontal approach” is employed: recent technology promises are analyzed, and both computational-paradigm issues and programming-model challenges are faced. After an exploration of the technology advances, an investigation of their impact on computational models is carried out. A software environment for the simulation of distributed systems is described, and particular attention is given to bio-inspired approaches for the distribution of computation, application deployment, and unreliability management. New architectural paradigms imply new compilation techniques to exploit well-known high-level languages: the introduction of new high-level programming primitives has always been rejected by programmers, hence more intelligence has to be added to compilers. This issue is faced by introducing a compiler framework aimed at extracting application parallelism and translating it into a representation suitable for targeting future machines based on nano-scale devices.</span><br /></span></p></div>Alberto Gallinihttp://www.blogger.com/profile/16935805616874075477noreply@blogger.com1