Category Archives: Academic

Supervised Learning

In supervised learning, every example in the dataset is labelled, which means the computer has been told the 'correct answer' for each one. The algorithm then learns from these labelled examples to make predictions on new data. If you want to predict a target value, supervised learning is the place to look. It covers two main kinds of task: classification and regression. In a classification problem the predicted output is a discrete value, whereas in a regression problem it is a continuous value. This is the main difference between the two [1].

Once supervised learning has been chosen for a classification or regression problem, the next step is to train the algorithm so that it learns from the dataset. The data used for training is called the training set. A training set consists of many training examples, each with several features and one labelled value (the target variable) [1]. In general, the full dataset is divided into three parts: a training set, a cross-validation set and a test set, which take up 60%, 20% and 20% of the data, respectively. How these three sets are used will be discussed later. The learning algorithm uses the training examples to produce a hypothesis, which is the resulting function. This function takes new data as input and predicts the output. There are many powerful supervised learning algorithms, and they cannot all be covered in this report. This part introduces linear regression, logistic regression, neural networks and Support Vector Machines (SVMs).
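The 60% / 20% / 20% division can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `split_dataset`, its parameters and the toy data are made up for this example, and shuffling before splitting is a common choice rather than something the text prescribes.

```python
import random


def split_dataset(examples, train_frac=0.6, cv_frac=0.2, seed=0):
    """Shuffle the examples and cut them into training, cross-validation and test sets.

    The fractions default to the 60/20/20 split described in the text;
    whatever is left after the first two cuts becomes the test set.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable

    n_train = round(len(shuffled) * train_frac)
    n_cv = round(len(shuffled) * cv_frac)

    train = shuffled[:n_train]
    cv = shuffled[n_train:n_train + n_cv]
    test = shuffled[n_train + n_cv:]
    return train, cv, test


# Toy labelled data (hypothetical): each example is (feature, target).
data = [(x, 2 * x + 1) for x in range(10)]
train, cv, test = split_dataset(data)
print(len(train), len(cv), len(test))
```

Every example ends up in exactly one of the three sets, so the test set is never seen during training.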

Linear regression is one of the simplest supervised learning models. Its hypothesis is $$ h_\theta(x)=\theta_0+\theta_1 x_1+\dots+\theta_n x_n $$ where $x_n$ is the value of feature $n$ and $\theta_0,\dots,\theta_n$ are the parameters of the model.
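As a small illustration, the linear regression hypothesis can be evaluated directly in Python. The function name `hypothesis` and the parameter values below are hypothetical and chosen only for the example; in practice the parameters would be learned from the training set.

```python
def hypothesis(theta, x):
    """Evaluate theta_0 + theta_1 * x_1 + ... + theta_n * x_n.

    theta holds n + 1 parameters (theta_0 first); x holds the n feature values.
    """
    if len(theta) != len(x) + 1:
        raise ValueError("theta needs exactly one more entry than x")
    return theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))


# Hypothetical parameters and one example with two features.
theta = [1.0, 2.0, 3.0]
x = [4.0, 5.0]
print(hypothesis(theta, x))  # 1 + 2*4 + 3*5 = 24.0
```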

[1] P. Harrington, Machine learning in action. Shelter Island, N.Y.: Manning Publications Co., 2012.

An Introduction to Machine Learning

Machine learning is a popular topic in computer science, but it can be applied in many other fields. For example, it can predict the probability of rain or classify the species of a bird. It is a powerful technique that will be used more and more in daily life. Machine learning is a good tool for gaining insight into data, letting the computer help us understand what the data means [1]. According to Mitchell, machine learning can be defined as follows: “a computer program is said to learn from experience E, with respect to some task T, and some performance measure P, if its performance on T as measured by P improves with experience E.” The three elements of this definition are the class of tasks, the measure of performance to be improved and the source of experience [2]. The experience E is what the machine learns from; in general, a machine learning algorithm learns from a dataset. The task T is what we want the machine to do once training is finished. The performance measure P could be the probability that the machine completes the task, or the accuracy of the algorithm's results. During training, a computer is more patient than a person, so it can analyse the same data many times and may find a better choice than a person would. Machine learning is generally divided into two types: supervised learning and unsupervised learning. The difference between them is whether people tell the machine which answer is right. In other words, when the data is labelled, the problem is one of supervised learning. In unsupervised learning there is no labelled data, so the computer does not know what is right and must find the answer by itself [1]. Supervised and unsupervised learning will be discussed in detail in a series of later articles.
Moreover, how to choose and improve the algorithm is another important consideration when designing a machine learning system.

[1] P. Harrington, Machine learning in action. Shelter Island, N.Y.: Manning Publications Co., 2012.

[2] T. Mitchell, Machine learning. Singapore: McGraw-Hill, 1997, pp. 1-4.

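The Difference Between Virtual Hosts and Cloud Servers

A virtual host is a host simulated in software on top of a physical machine. Using special software and hardware techniques, one real physical computer is partitioned into multiple logical storage units. None of these units has a physical body of its own, yet each can work on the network just like a real physical host. In effect, some or all of the services of one server are logically divided among multiple servers.

A server is a piece of computer software that manages resources and provides services to users; servers are generally divided into file servers, database servers and application servers. A computer or computer system that runs such software is also called a server. A cloud server uses virtualization to split a physical server into multiple independent servers, each a combination of computing, networking and storage.

Compared with cloud computing, the two concepts above are closer to the results of applying a technology. Cloud computing is more of a service model, in which distributed computing, virtualization, networking, automated operations and disaster-recovery technologies work together to serve customers.

Alibaba Cloud's virtual hosts cannot use SSL certificates or HTTPS.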

Fast Fourier Transform

The Fast Fourier Transform (FFT) is an algorithm that speeds up the calculation of the Discrete Fourier Transform (DFT). Computed directly, the DFT is expensive: an N-point transform takes on the order of $N^2$ complex multiplications. In digital signal processing the superposition principle is an important tool, and the FFT makes good use of it. The FFT decomposes the original DFT into several shorter DFTs whose results can be combined to give the original DFT. These shorter DFTs may be calculated directly or decomposed further into even shorter DFTs. This makes the calculation of the DFT faster and easier. There are two basic classes of FFT algorithm: decimation-in-time and decimation-in-frequency. The basic principle of each is discussed below.

The principle of the decimation-in-time FFT is easy to understand. The basic operation is to split the N-point input sequence into its even-numbered and odd-numbered points. After the DFT of each half-length sequence has been calculated, the two results are combined by butterfly computations to recover the original N-point DFT. An example makes this process easier to follow. The equations are not presented in this article, but the flow graph of an 8-point DFT using butterfly computations is shown. That figure shows the process of the FFT.
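The even/odd splitting can be sketched as a short recursive Python function. This is a minimal radix-2 illustration, assuming the input length is a power of two; the names `fft_dit` and `dft_direct` are made up for this example, and the direct DFT is included only as a reference to compare against.

```python
import cmath


def dft_direct(x):
    """Reference DFT computed straight from the definition (order N^2 work)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def fft_dit(x):
    """Radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0:
        raise ValueError("input must not be empty")
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")

    even = fft_dit(x[0::2])  # DFT of the even-numbered points
    odd = fft_dit(x[1::2])   # DFT of the odd-numbered points

    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: one twiddle-factor product feeds two outputs.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


# 8-point example, matching the size of the flow graph mentioned in the text.
x = [1, 2, 3, 4, 0, 0, 0, 0]
fast = fft_dit(x)
slow = dft_direct(x)
print(max(abs(a - b) for a, b in zip(fast, slow)))  # difference should be tiny
```

Each level of recursion halves the problem, which is where the saving over the direct calculation comes from.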

The decimation-in-time FFT divides the input sequence into smaller subsequences. Alternatively, the output sequence can be divided into smaller subsequences, which gives the decimation-in-frequency FFT. The difference between the two methods is which sequence is divided: the input or the output.
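For comparison, the decimation-in-frequency idea can be sketched the same way. Again this is only an illustration assuming a power-of-two length, and `fft_dif` is a made-up name: here the input is split into its first and second halves, and it is the output that ends up separated into even-numbered and odd-numbered frequency points.

```python
import cmath


def fft_dif(x):
    """Radix-2 decimation-in-frequency FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0:
        raise ValueError("input must not be empty")
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")

    half = n // 2
    # Butterflies come first: combine the first and second halves of the input.
    sums = [x[i] + x[i + half] for i in range(half)]
    diffs = [
        (x[i] - x[i + half]) * cmath.exp(-2j * cmath.pi * i / n)
        for i in range(half)
    ]

    out = [0j] * n
    out[0::2] = fft_dif(sums)   # even-numbered outputs X[0], X[2], ...
    out[1::2] = fft_dif(diffs)  # odd-numbered outputs X[1], X[3], ...
    return out


print(fft_dif([1, 1, 1, 1]))  # a constant input has energy only at X[0]
```

In the decimation-in-time sketch the twiddle factors are applied after the half-length transforms; here they are applied before, which is the practical difference between the two flow graphs.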

Introduction to Convolutional Codes

A convolutional code, which adds parity bits to the data, is a powerful technique for correcting random errors. The encoder uses a shift register, so the output is a function of the current and previous data bits. The constraint length K describes the amount of memory the code has. The rate of 1/2 used in WSPR means that two output bits are generated for each input bit. Fig. 2 shows a simple convolutional encoder.

At the start, the shift register is loaded with 111 and the input bit stream is 11011. Input bits enter the shift register from the left, and the output bits O1 and O2 are calculated by modulo-2 adders. From Table I, the output for input bits 11011 is 10 01 00 01 10 10 10. This completes the description of convolutional encoding. To show how memory helps to correct random errors, the state diagram of this example is given in Fig. 3. R1 and R2 together have four states. R0 is the input bit, and the number next to each arrow is the code output. According to Fig. 3, some transitions are not allowed because of the memory of the system, for example from state 10 to state 00. The output data stream corresponds to a sequence of allowed transitions between the states. If an error is introduced during transmission, decoding would involve a non-permitted transition, which reveals that an error has occurred. The convolutional decoder then uses soft decisions to determine the correct message. After this process, the random error is corrected.
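The shift-register idea can be sketched in Python. Since Fig. 2 and Table I are not reproduced here, the sketch makes its own assumptions and is not expected to match the table: it uses K = 3, an all-zero initial register, two flushing zeros at the end, and the commonly used adder connections O1 = R0 ⊕ R1 ⊕ R2 and O2 = R0 ⊕ R2. The function names are hypothetical.

```python
def conv_encode(bits, flush=True):
    """Rate-1/2, K=3 convolutional encoder (illustrative, assumed wiring).

    Assumptions: register starts at all zeros, O1 = R0 ^ R1 ^ R2 and
    O2 = R0 ^ R2. These are NOT taken from Fig. 2 of the text.
    """
    r1, r2 = 0, 0  # memory cells holding the two previous input bits
    stream = list(bits) + ([0, 0] if flush else [])  # zeros clear the register
    out = []
    for r0 in stream:  # r0 is the bit entering the register
        o1 = r0 ^ r1 ^ r2  # first modulo-2 adder
        o2 = r0 ^ r2       # second modulo-2 adder
        out.append((o1, o2))
        r1, r2 = r0, r1    # shift the register by one position
    return out


def allowed_next_states(state):
    """States (R1, R2) reachable in one step: the new R1 is the input bit,
    and the new R2 is always the old R1."""
    r1, _ = state
    return {(b, r1) for b in (0, 1)}


pairs = conv_encode([1, 1, 0, 1, 1])
print(" ".join(f"{a}{b}" for a, b in pairs))
print(allowed_next_states((1, 0)))  # state 00 is not in this set
```

The second function depends only on how the register shifts, not on the adder wiring, so it shows why a transition such as 10 to 00 can never occur in an error-free stream.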

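Notes on Building a Website

Today I want to sum up what I have learned recently from writing my own website.

My original reason for writing a website was to have a place where I could record and express myself. So everything follows the principle of keeping things simple, and I wrote a flat site. A flat site is just a simple header and footer, with the navigation in the header. The home page carries a short title and introduction, so it is clear and simply laid out. It is a good fit for a web beginner like me.

Step one: I suggest going online to learn the basics of what HTML and CSS are. HTML (Hyper Text Markup Language) is not a programming language but a markup language. A markup language is a set of markup tags, and HTML uses these tags to describe web pages. It is enough to get the meaning of the basic tags right. CSS is a file type that is very useful for layout, and a search on Baidu will turn up plenty of useful material. If, like me, you want to start with this kind of flat blog, there are many tutorial videos online, and you can learn as you build. This will help you.

Step two is to create your home page, content pages, index pages and so on, and to link them together. I used Hbuilder to write my site.

NB: sort out your categories right from the start, and make good use of JS files. JS files can save you a lot of tedious, repetitive work and can also add a lot to your site, so be sure to study them carefully.
At this point I would like to recommend the website of an expert, carlzhang.net. You can also get there directly through the acknowledgement link in the footer.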